Currently the File.open method has only a single parameter
mode can be set to suggest opening a file in text mode. Then, Python’s
open will by default attempt to use the common utf-8 encoding.
It is a common pattern in a code base I work on to use the
utf-8-sig encoding for CSV files, as this provides a slight improvement to user experience when opening the file in Excel. Right now, to create a file in this encoding using Django’s File, we have to open it in a binary mode and explicitly wrap the file handle in a codec. It would be helpful if, just like the base Python’s
File.open also accepted an
encoding parameter, and pass it to the Python’s
I believe that this kind of enhancement is natural, as Python developers are used to
open allowing to set encoding, especially in Python 3. Actually, I was a bit surprised not seeing it already implemented.
I’ve opened a ticket about this issue and I am willing to work on this feature if there is chance of it being merged.
David Sanders has suggested in the ticket to have a full
*args, **kwargs being allowed and passed directly to
open. I admit though I personally like being explicit in the offered API—especially as it wouldn’t be that easy to later reproduce the full scope of such API in libraries like django-storages. And so I would focus only on
I think it makes sense to match the api of
open where possible.
My only concern here is adding API that all users need to understand and decide about when only a very small subset will ever use it.
What does the wrapping in a codec actually look like? Is that really too much to bare for a rarely needed option?
with some_django_file.open("wb") as fh:
encoded_fh = io.TextIOWrapper(fh, encoding="utf-8-sig")
… use encoded_fh instead of fh …
There are two risks associated with this piece of code: (1) someone accidentally uses
fh, as opposed to
encoded_fh. (2) flush() is forgotten. Mitigation by creating a custom helper context manager also has a drawback of developers forgetting about its existence and using raw Django’s File.open instead.
I wonder how many users of Python’s
open function use the
encoding argument. Obviously Python has more developers than Django, but the standard library is also larger. Still, when we are writing Python code outside of Django, we take this feature for granted.
+1 from me.
I think matching the
open builtin is more important than expanding the API surface area. It’s like how
pathlib.Path.open supports the full
Plus, seeing the argument may be a small reminder that not all text files use UTF-8.
Yep, OK. +1 (just for clarity)
I’m also +1 on this, and I would second the proposal in the ticket where it’s suggested to pass
open to avoid gatekeeping.