Storing uploads momentarily

I am trying to upload a file, store it long enough to use it on the next page, then dispose of it afterwards. I am wondering about the best way to go about this. My experience with uploads is rudimentary. I am familiar with uploads associated with models. Since I don’t intend to keep the uploaded file, I don’t think having models for this one-off situation is warranted. Perhaps I am overthinking the approach.

I have tried default_storage from django.core.files.storage. With default_storage, I am able to save the file, but shortly afterwards default_storage believes the file does not exist. I can still see the file in the file browser, but default_storage does not agree.

file = self.request.FILES.get('file')
path = f"tmp/{self.request.user}/somename.ext"
default_storage.save(path, file)
print(default_storage.exists(path))
False

Why? The file is there on the system.
I am open to ideas.

Without knowing precisely what type of file you’re dealing with, its size, or the type of processing required for that file, it’s really tough to offer any tangible solution.

Having said that, my gut reaction is to suggest not using a FileField like you would normally use to hold uploaded files, but to create a model to temporarily hold the data from the file. That’s potentially going to make it easier to use the data in the subsequent view, as well as making it very easy to dispose of the data once that view has finished.

The files are mainly Office documents (docx, xls, odt, ods, txt) for data extraction purposes only. They may contain images, so I expect document sizes between a few KB and 90 MB.

I assume you mean not using FileField via Forms alone. Otherwise, models also use FileField for handling file uploads all the same. The database keeps a reference to the file while the data resides on disk somewhere. Yes, it seems easier to have models for such things. I just did not think the frequency of use justified that approach.

Yes to a FileField in the form.

No to a FileField in the model.

Instead, create a data field - it could be just a BinaryField in the model that stores all the data from the file.

It’s a lot easier to manage the data that way than any other way I can think of - particularly if there’s any chance that two different people might be uploading a file at the same time.

huh?

Huh???

Now I am confused. Could you explain further?

I would create a form, with a FileField in the form to allow the user to upload the file.

However, the model in which I’m going to store the data would not have a FileField in the model. It would have a BinaryField to store the data that was uploaded. (Remember, a model FileField (effectively) stores the file name of the uploaded file, not the data itself.)

So then the second view would process the data from the model’s BinaryField as if it were reading it from a file - and then delete it as appropriate.
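As a framework-free sketch of that flow (store the raw bytes in a row, read them back as a file-like object, then delete the row), the snippet below uses the standard-library sqlite3 module as a stand-in for the Django model. It is not Django's ORM, and the table temp_upload, its columns, and the helpers store_upload and take_upload are hypothetical names chosen for illustration.

```python
import sqlite3
from io import BytesIO

# In-memory database standing in for the real one (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp_upload (user_id INTEGER PRIMARY KEY, data BLOB NOT NULL)"
)


def store_upload(user_id, uploaded):
    """First view: read the whole upload and keep it as a blob, one row per user."""
    conn.execute(
        "INSERT OR REPLACE INTO temp_upload (user_id, data) VALUES (?, ?)",
        (user_id, uploaded.read()),
    )


def take_upload(user_id):
    """Second view: return the stored bytes as a file-like object and delete the row."""
    row = conn.execute(
        "SELECT data FROM temp_upload WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None
    conn.execute("DELETE FROM temp_upload WHERE user_id = ?", (user_id,))
    return BytesIO(row[0])


# Usage: simulate an upload, then consume it exactly once.
store_upload(1, BytesIO(b"example document bytes"))
stored = take_upload(1)
content = stored.read()
second_attempt = take_upload(1)
```

Keying the row on the user means two people uploading at the same time never touch each other's data, which is the concern raised earlier in the thread.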

Okay, now I understand. Thank you very much.

I am observing a strange phenomenon here. It appears that model objects with BinaryField attributes do not follow certain design rules. I am probably missing something. My observation is that models with BinaryField columns write to the database despite save(commit=False). This almost drove me nuts yesterday. Check this out:

    def form_valid(self, form):
        """If the form is valid, save the associated model."""
        
        self.object = form.save(commit=False)
        self.object.file = self.request.FILES.get("file").read()
        self.object.save()
        return super().form_valid(form)

I find that something ends up in the database even when I remove the line (self.object.save()). Something does … whatever that is, but I get an error message saying that something is already there for the user (the user column has a unique constraint … I don’t want multiple files from each user). What is the design? Is it that Django creates a dud/empty entry in anticipation of a transaction? Otherwise, something is interfering with the model.save() method even when commit=False.

I’d need to see the complete view along with the models involved.

I had a unique constraint on the userid attribute of the model (a OneToOneField). A OneToOneField already has a built-in unique constraint, and I suspect the additional layer was interfering with Django. That portion works for now. The current challenge is passing the uploaded file to a method that expects a file-like object or a path.

tempfile = TempFile.objects.filter(userid=self.request.user.id)
if tempfile.exists():
    parse.document(tempfile.first().file)

Output:
AttributeError: 'memoryview' object has no attribute 'seek'

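Found a solution to this:

from io import BytesIO

# Fetch the row once; first() returns None when the user has no upload.
tempfile = TempFile.objects.filter(userid=self.request.user.id).first()
if tempfile is not None:
    # The BinaryField value is a memoryview here; wrap its bytes in a seekable file-like object.
    fileobject = BytesIO(tempfile.file.tobytes())
    parse.document(fileobject)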

BytesIO wraps the bytes that came from the database and returns a file-like object (which is perfect for this situation). I am curious: where does the Django documentation cover stuff like this? I got hints from here and here.

This is one of those topics where you would need to read between the lines.

The basic documentation for a FieldFile talks about this a little bit.

The object wrapped by the class is not necessarily a wrapper around Python’s built-in file object. Instead, it is a wrapper around the result of the Storage.open() method, which may be a File object, or it may be a custom storage’s implementation of the File API.

So this is the only real clue that you’re not working with a File, and that seems to me to be the key point to remember when you’re working with a FieldFile.

You would then need to know that the parse.document function requires a File - or at least something that supports a seek function. From there, you either wrap the FieldFile object like you did, or find a different function to process the file.
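The wrapping step can be sketched with the standard library alone. This is a minimal illustration, assuming the value read from the database is a memoryview as in the error message above; the sample payload and the helper name as_file are hypothetical.

```python
from io import BytesIO

# Stand-in for the value read from the BinaryField (assumption for this sketch).
raw = memoryview(b"PK\x03\x04 example payload")

# A memoryview is a view over a buffer, not a file: it has no seek().
memoryview_has_seek = hasattr(raw, "seek")


def as_file(data):
    """Wrap bytes-like data in a seekable, readable file-like object."""
    # bytes() accepts both bytes and memoryview, so the caller need not care which it has.
    return BytesIO(bytes(data))


fileobject = as_file(raw)
fileobject.seek(2)          # seek() now works
chunk = fileobject.read(2)  # reads the two bytes after the "PK" prefix
fileobject.seek(0)          # rewind before handing it to a parser
```

Note that bytes(data) copies the payload into memory, so for the larger documents mentioned in the thread the whole file is held in RAM while it is being parsed.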
