InMemoryUploadedFile may have a bug

onyeibo · December 4, 2024, 9:03am

I am troubleshooting an anomaly in the way a generator from InMemoryUploadedFile.chunks() decodes files.

I created five variants of CSV files from Microsoft Excel 2013:

sample-mac.csv
sample-dos.csv
sample-comma.csv
sample-tab-delimited.csv (actually saves as .txt but a change of extension doesn’t hurt)
sample-unicode.csv

Excel provides options for saving in those formats. The last encodes with "utf-16" by default. The rest are plain text (ASCII). The plain text variants differ in their end-of-line characters ("\r" for mac, and "\r\n" for the others created on an MS-Windows system). Plain text files (including csv files) created on Linux use only "\n" for EOL.
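As a rough illustration of how those variants differ at the byte level, here is a minimal sketch. The helper names guess_charset and guess_eol are hypothetical, the sample contents are made up (they are not the actual Excel output), and the BOM check is only a heuristic:

```python
import codecs


def guess_charset(chunk: bytes) -> str:
    """Hypothetical helper: pick a codec from a leading byte-order mark."""
    if chunk.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if chunk.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # Assumption: anything without a BOM is treated as plain ASCII here.
    return "ascii"


def guess_eol(text: str) -> str:
    """Hypothetical helper: report which end-of-line convention appears."""
    if "\r\n" in text:
        return "\r\n"
    if "\r" in text:
        return "\r"
    return "\n"


# Made-up stand-ins for the first chunk of three of the files.
samples = {
    "sample-mac.csv": "a,b\r1,2\r".encode("ascii"),
    "sample-dos.csv": "a,b\r\n1,2\r\n".encode("ascii"),
    "sample-unicode.csv": "a\tb\r\n1\t2\r\n".encode("utf-16"),
}

report = {}
for name, chunk in samples.items():
    charset = guess_charset(chunk)
    text = chunk.decode(charset)
    report[name] = (charset, guess_eol(text))
    print(name, charset, repr(guess_eol(text)))
```

Printing the EOL with repr() keeps the control characters visible instead of letting the terminal act on them.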

So I have a file which is an instance of InMemoryUploadedFile, and I get a generator that yields all the chunks from that file:

chunks = file.chunks()
# where chunks is a generator

and I take the text in the first chunk for sampling:

sampler = next(chunks)

The sampler is still a bytes object at this point. So I decode it and observe …

print(sampler.decode(charset))
# where charset is ascii (None implies utf-8)

OBSERVATION:
The characters in the sampler decode properly (as expected) except for sample-mac.csv (the first file sample). Somehow, somewhere … chunks() mangles or truncates the string such that the final output is a miserable (small) version of what was intended. Some characters are lost! Why? It only happens with sample-mac.csv.

Now here is the bummer:
when I open and read the same file directly via a Python shell, it reads perfectly.

with open("sample-mac.csv", "r") as file:
    read = file.read()
print(read)

The above code prints out everything, with the same encoding and all. So Python reads the same file properly, but something messes with the file when it passes through InMemoryUploadedFile.chunks(). Does anyone have an explanation for this?
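One factor that may explain the difference, offered as a sketch and not a diagnosis: open() in text mode applies universal newline translation by default (newline=None), so a lone "\r" becomes "\n" on read, while bytes that are decoded with .decode() keep every "\r" as-is. The file contents below are made up for illustration, and the file is written to a temporary directory so the example is self-contained:

```python
import os
import tempfile

# Made-up content using classic Mac line endings ("\r" only).
raw = b"name,age\rada,36\rlinus,54\r"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample-mac.csv")
    with open(path, "wb") as fh:
        fh.write(raw)

    # Text mode with the default newline=None translates "\r" to "\n".
    with open(path, "r", encoding="ascii") as fh:
        text_mode = fh.read()

    # Binary mode returns the bytes untouched, much like an uploaded chunk.
    with open(path, "rb") as fh:
        binary_mode = fh.read().decode("ascii")

# repr() shows the control characters instead of sending them to the terminal.
print(repr(text_mode))
print(repr(binary_mode))
```

The two strings have the same length and the same visible characters; only the line terminators differ, which is enough to change what a terminal displays.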

onyeibo · December 7, 2024, 12:31pm

I found the source of the anomaly. I am sorry to say it is another false alarm. The issue depends on the terminal emulator. It appears the “foot” terminal prints “\n\r” delimiters in unexpected ways. The printed output is misleading, whereas the actual content is correct.
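For anyone who hits the same thing, a way to check the content without relying on how the terminal renders carriage returns is to look at repr() of the decoded text, or to split it into rows first. A minimal sketch, with made-up bytes standing in for the first chunk:

```python
import csv

# Made-up stand-in for next(file.chunks()) on a "\r"-delimited upload.
sampler = b"name,age\rada,36\rlinus,54\r"

decoded = sampler.decode("ascii")

# repr() escapes "\r", so nothing gets overwritten on screen.
print(repr(decoded))

# str.splitlines() treats "\r", "\n" and "\r\n" alike.
lines = decoded.splitlines()
print(lines)

# csv.reader accepts any iterable of strings, so the split lines work directly.
# Caveat: this simple split would break quoted fields containing line breaks.
rows = list(csv.reader(lines))
print(rows)
```

If the repr() output shows every character, the data is intact and only the display was misleading.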
