Calculating hash of file while adding causes "I/O on closed file" error

Hi all,

I’m trying to compute a hash of a file’s contents so that I can reprocess the file if it has changed, but I’m falling over at square one: handling the uploaded file without breaking the final ‘save()’.

This is my ‘clean_file’ function:

    def clean_file(self):
        file_upload = self.cleaned_data.get("file")
        hasher = hashlib.sha256()
        with file_upload.open('rb') as f:
            for chunk in iter(lambda: f.read(4096), b""):
                hasher.update(chunk)
        content_hash = hasher.hexdigest()
        # side effect: updates the content_hash field in the model instance
        self.cleaned_data["content_hash"] = content_hash
        return file_upload

I can’t find anyone else who has run into this specific problem, so I’m not sure how to work around it.

I think I’ve resolved this problem by switching from

    with self.file.open('rb') as f:

to

    f = self.file.open('rb')

I’m hoping that won’t cause any problems. But it’s still not working:
the ‘content_hash’ field is not updated by doing this in clean_file(), or by overriding the whole clean() function of the ModelForm.
I’m unsure why, but maybe it is because the content_hash field is declared with ‘editable=False’.
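On the closed-file error itself: a likely explanation is that leaving the `with` block closes the file object, so the later save() finds it already closed. Here is a minimal sketch of hashing without closing, then rewinding. It uses io.BytesIO as a stand-in for the upload, and hash_and_rewind is a made-up helper name, not part of Django.

```python
import hashlib
import io


def hash_and_rewind(fileobj, chunk_size=4096):
    """Return the SHA-256 hex digest of fileobj, leaving it open and rewound."""
    hasher = hashlib.sha256()
    fileobj.seek(0)  # start from the beginning in case something already read it
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        hasher.update(chunk)
    fileobj.seek(0)  # rewind so whatever saves the file later sees all of it
    return hasher.hexdigest()


# io.BytesIO stands in for the uploaded file (an assumption for illustration).
upload = io.BytesIO(b"example contents")
digest = hash_and_rewind(upload)
still_open = not upload.closed

# By contrast, using the object as a context manager closes it on exit.
other = io.BytesIO(b"example contents")
with other as f:
    f.read()
closed_after_with = other.closed
```

In a form's clean method the same idea would mean reading from the uploaded file directly rather than inside a `with` block, and seeking back to the start before returning it.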

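    import hashlib

    def file_sha1(file):  # hypothetical name; the original snippet omitted the def line
        # Hash the file in chunks so large files are not read into memory at once.
        sha1 = hashlib.sha1()
        with file.open('rb') as readhandle:
            for chunk in readhandle.chunks():
                sha1.update(chunk)
        return sha1.hexdigest()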

This code works for me, but there ‘file’ is a Python File object. Since you access your file through cleaned_data, it might be either an UploadedFile or an InMemoryUploadedFile. In the latter case you may not be able to open the file and iterate over its chunks.
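If it is unclear which kind of object arrives, one option is to avoid depending on the type at all. This is a rough sketch, assuming only that the object has either a chunks() method or read() and seek(). FakeChunkedUpload, iter_chunks and sha1_of are names made up for illustration, and io.BytesIO stands in for a plain file.

```python
import hashlib
import io


def iter_chunks(fileobj, chunk_size=4096):
    """Yield the file's bytes in pieces, via chunks() if the object has one."""
    if hasattr(fileobj, "chunks"):
        yield from fileobj.chunks()
    else:
        fileobj.seek(0)
        yield from iter(lambda: fileobj.read(chunk_size), b"")


def sha1_of(fileobj):
    """Return the SHA-1 hex digest without opening or closing fileobj."""
    sha1 = hashlib.sha1()
    for chunk in iter_chunks(fileobj):
        sha1.update(chunk)
    return sha1.hexdigest()


class FakeChunkedUpload:
    """Hypothetical stand-in for an upload object that exposes chunks()."""

    def __init__(self, data):
        self._data = data

    def chunks(self, chunk_size=4):
        for i in range(0, len(self._data), chunk_size):
            yield self._data[i:i + chunk_size]


data = b"same bytes either way"
from_chunks = sha1_of(FakeChunkedUpload(data))
from_read = sha1_of(io.BytesIO(data))
```

Both paths feed the same bytes to the hasher, so the digest should not depend on which branch was taken.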