InMemoryUploadedFile may have a bug

I am troubleshooting an anomaly in the way a generator from InMemoryUploadedFile.chunks() decodes files.

I created five variants of CSV files from Microsoft Excel 2013:

  1. sample-mac.csv
  2. sample-dos.csv
  3. sample-comma.csv
  4. sample-tab-delimited.csv (actually saves as .txt but a change of extension doesn’t hurt)
  5. sample-unicode.csv

Excel provides options for saving in each of those formats. The last one is encoded as UTF-16 by default; the rest are plain text (ASCII). The plain-text variants differ in their end-of-line character: "\r" for the Mac variant and "\r\n" for the others, which were created on an MS-Windows system. Plain-text files (including CSV files) created on Linux use only "\n" for EOL.
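
To make the EOL differences concrete, here is a minimal sketch that counts line-ending styles in raw bytes. The function name detect_eol and the sample byte strings are made up for illustration; it assumes ASCII/UTF-8 content and would not give meaningful results on the UTF-16 variant.

```python
def detect_eol(data: bytes) -> str:
    """Return a label for the most common end-of-line style in ASCII/UTF-8 bytes."""
    crlf = data.count(b"\r\n")
    # Bare CR and bare LF are the ones that are not part of a CRLF pair.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    counts = {"CRLF": crlf, "CR": cr, "LF": lf}
    label = max(counts, key=counts.get)
    return label if counts[label] > 0 else "none"


# Hypothetical stand-ins for the Excel exports described above.
mac_sample = b"name,age\rada,36\rlin,41\r"
dos_sample = b"name,age\r\nada,36\r\nlin,41\r\n"
linux_sample = b"name,age\nada,36\nlin,41\n"

for name, sample in [("mac", mac_sample), ("dos", dos_sample), ("linux", linux_sample)]:
    print(name, detect_eol(sample))
```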

So I have a file which is an instance of InMemoryUploadedFile, and I get a generator that yields the chunks of that file:

chunks = file.chunks()
# where chunks is a generator

and I take the text in the first chunk for sampling:

sampler = next(chunks)

The sampler is still a bytes object at this point. So I decode it and observe …

print(sampler.decode(charset))
# where charset is ascii (None implies utf-8)
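
Since one of the five files is UTF-16, the charset has to be chosen per file before decoding. A rough sketch of one way to do that is to look for a byte-order mark at the start of the first chunk. The helper guess_charset is hypothetical, it assumes the UTF-16 file starts with a BOM, and it ignores UTF-32 entirely.

```python
import codecs


def guess_charset(sample: bytes, default: str = "utf-8") -> str:
    """Guess a codec name from a leading byte-order mark, else fall back to default."""
    if sample.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"  # the utf-16 codec consumes the BOM while decoding
    if sample.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # strips the UTF-8 BOM while decoding
    return default


# Hypothetical first chunks: one UTF-16 with BOM, one plain ASCII.
unicode_sampler = "name,age\r\nada,36\r\n".encode("utf-16")
ascii_sampler = b"name,age\rada,36\r"

for sampler in (unicode_sampler, ascii_sampler):
    charset = guess_charset(sampler)
    print(charset, repr(sampler.decode(charset)))
```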

OBSERVATION:
The characters in the sampler decode properly (as expected) for every file except sample-mac.csv (the first sample). Somehow, somewhere … chunks() seems to mangle or truncate the string, so that the final output is a miserable (small) version of the intended text. Some characters are lost! Why? It only happens with sample-mac.csv.

Now here is the bummer:
when I open and read the same file directly in a Python shell, it reads perfectly.

# the with block closes the file automatically
with open("sample-mac.csv", "r") as file:
    read = file.read()
print(read)

The above code prints out everything – same encoding and all. So Python reads the same file properly, but something seems to mess with the file when it passes through InMemoryUploadedFile.chunks(). Does anyone have an explanation for this?
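
One difference between the two code paths is worth checking here: open() in text mode uses universal newlines by default, so "\r" is translated to "\n" on read, whereas decoding raw bytes leaves "\r" untouched. The sketch below uses only the standard library and a made-up byte string in place of sample-mac.csv, and compares the results with repr() so the terminal plays no part.

```python
import os
import tempfile

raw = b"name,age\rada,36\rlin,41\r"  # hypothetical stand-in for sample-mac.csv

# Write the bytes to a temporary file so it can be reopened in text mode.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # Default text mode: "\r" and "\r\n" are translated to "\n" while reading.
    with open(path, "r", encoding="ascii") as fh:
        translated = fh.read()
    # newline="" turns the translation off and returns line endings as stored.
    with open(path, "r", encoding="ascii", newline="") as fh:
        untranslated = fh.read()
finally:
    os.remove(path)

decoded_chunk = raw.decode("ascii")  # what decoding a raw chunk yields

print(repr(translated))
print(repr(untranslated))
print(untranslated == decoded_chunk)
```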

I found the source of the anomaly. I am sorry to say it is another false alarm. The issue depends on the terminal emulator. It appears that the “foot” terminal prints “\n\r” delimiters in unexpected ways. The printed output is misleading, whereas the actual content is correct.
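
For anyone who hits the same thing, here is a very simplified model of why printed output can look truncated when the text contains bare "\r": the cursor returns to column 0 and later text overwrites earlier text. The function render_like_terminal is a toy I made up for illustration, not how foot or any real terminal is implemented, and the sample bytes are hypothetical. Inspecting the data with repr() or splitlines() avoids the problem.

```python
def render_like_terminal(text: str) -> list:
    """Toy model: "\\r" moves the cursor to column 0, "\\n" starts a new row."""
    rows = [[]]
    col = 0
    for ch in text:
        if ch == "\r":
            col = 0  # carriage return: back to the start of the same row
        elif ch == "\n":
            rows.append([])  # line feed: new row (column reset for simplicity)
            col = 0
        else:
            row = rows[-1]
            if col < len(row):
                row[col] = ch  # overwrite what was printed earlier
            else:
                row.append(ch)
            col += 1
    return ["".join(row) for row in rows]


decoded = b"name,age\rada,36\rlin,41\r".decode("ascii")

print(render_like_terminal(decoded))  # ['lin,41ge']
print(repr(decoded))                  # the actual content is intact
print(decoded.splitlines())           # ['name,age', 'ada,36', 'lin,41']
```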
