Programmatically upload a file to a Model

Hey,

I have a database with a bunch of files from a legacy app that need to be uploaded to an existing Django database (files are managed on an S3 bucket). The images relate to products: I have 4000 images, and for each image I have the corresponding product # in the Postgres database (so I can set the fk_produit below):

class ImageProduit(models.Model):
	fk_produit = models.ForeignKey(Produit, on_delete=models.SET_DEFAULT,related_name="images", null=False, blank=False, default=0)
	image = models.ImageField(upload_to=imgprod_uploadto, verbose_name="Image Produit", blank=True, null=True, storage=MediaStorage())
	notes = models.TextField(verbose_name="Notes", null=True, blank=True)

What I would like to do would be something like (pseudo-code):

for file in os.listdir(path_to):
   img = Image.open(file)
   prod_id = get_prod_id(img)
   new_image_product = ImageProduit(fk_produit=prod_id, image = img)
   new_image_product.save()

I’d much prefer to go through the Django ORM than, say, bulk upload to the S3 bucket, because the file structure depends on the product and I have a post_save() signal. If I don’t go through the ORM, I’d have to write scripts to reproduce the S3 bucket structure, plus manually do whatever the post_save() signal does.

How would you go about this? Is it possible to pass an image object directly to the ImageField as if it had been obtained from the request file uploader?
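
For context, the upload_to callable and the post_save() signal are along these lines (an illustrative sketch only; the real bodies are specific to my app):

from django.db.models.signals import post_save
from django.dispatch import receiver

def imgprod_uploadto(instance, filename):
    # The S3 key depends on the related product
    return f"produits/{instance.fk_produit_id}/images/{filename}"

@receiver(post_save, sender=ImageProduit)
def on_imageproduit_saved(sender, instance, created, **kwargs):
    # Side effects that should also run for the migrated files
    # (thumbnails, touching the parent Produit, etc.)
    ...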

Are you talking about running this on the server where both the files and Django exist, or are you looking to run this on a different machine that needs to copy the files to the server where Django is running?

The answer depends upon which situation you have.

If everything is local, then yes, you can write a management command using logic similar to what you describe.

If this is running on a local machine sending files to the server, then you could write a script using the Requests module to submit the files as form submissions.
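
For example, a rough sketch of that second approach (the URL, field names, and auth here are placeholders for whatever your upload view expects):

import requests

# Placeholder endpoint and credentials - adjust to your upload view
UPLOAD_URL = "https://example.com/products/images/upload/"

with open("12345-front.jpg", "rb") as fh:
    response = requests.post(
        UPLOAD_URL,
        data={"fk_produit": 12345, "notes": "migration - dbase"},
        files={"image": fh},
        auth=("user", "password"),  # or session/token auth, CSRF handling, etc.
    )
    response.raise_for_status()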


Check the docs on Using files in models [sorry for the bad link]

>>> from pathlib import Path
>>> from django.core.files import File
>>> path = Path('/some/external/specs.pdf')
>>> car = Car.objects.get(name='57 Chevy')
>>> with path.open(mode='rb') as f:
...     car.specs = File(f, name=path.name)
...     car.save()

In fact, I think my current config allows me to ignore the distinction.

Basically, the files are local and need to be uploaded to the remote S3 bucket (with the objects saved to the AWS RDS Postgres production db). However, I do have settings.py configured in such a way that I can run Django on localhost while still connecting to the remote S3 bucket (and to my AWS RDS database for the production app).

Hmmm okay - I was looking at the source code and the File reference docs, and everything mentioned there seemed to involve rather complex objects like InMemoryFileUploadHandler().

But I’ll look at that part of the docs, and yes, that is fairly close to what I’m looking for, I think.

I’ll post a fuller answer if that works.

So, putting together all the bits & pieces, I came up with this:

import os
from pathlib import Path

from django.core.files import File
from django.core.management.base import BaseCommand, CommandError

from products.models import DocumentProduit, Produit

class Command(BaseCommand):
    def handle(self, *args, **options):
        try:
            with os.scandir(path="./ressources/doc_products/") as iterator:
                for entry in iterator:                                          # for each file
                    print(f'{entry.path} ({entry.name}) is_file={entry.is_file()}')
                    opath = Path(entry.path)
                    with opath.open(mode='rb') as ofile:
                        # Wrapping the handle in File lets the FileField's storage
                        # backend (S3 here) take care of the upload on save()
                        document = File(ofile, name=opath.name)
                        self.manage_product_and_docs(document, opath)
        except CommandError as ex:
            print(f"Command error: {ex}")

    def manage_product_and_docs(self, document, opath):
        docprod = DocumentProduit()
        prod_id = None
        try:
            prod_id = self.get_noprod_from_filename(opath.name)
            prod = Produit.objects.get(id=prod_id)
            docprod.notes = "migration - dbase"
            docprod.document = document
            docprod.fk_produit = prod
            docprod.save()   # uploads the file and fires the post_save signal
        except Exception as ex:
            print(f"Couldn't add document to product id {prod_id}: {ex}")

    def get_noprod_from_filename(self, filename):
        # Filenames look like "<product id>-<something>.<ext>"
        noprod = filename.split(".")[0].split(sep="-")[0]
        return noprod

To run:

python manage.py import_documents

with the command above saved under /projectlibs/management/commands/import_documents.py.

In this setup, the files live locally in my project structure. I’m passing as environment variables everything needed for the application to run locally while connecting to the remote db (Postgres on AWS RDS in my case) as well as to the S3 bucket. Obviously, the details of this will depend on the specifics of your application structure. See Ken’s post for details on this.
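
For example, the relevant parts of settings.py just read everything from the environment, roughly like this (a simplified sketch assuming django-storages for S3; the variable names are mine):

import os

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.environ["DB_NAME"],
        "USER": os.environ["DB_USER"],
        "PASSWORD": os.environ["DB_PASSWORD"],
        "HOST": os.environ["DB_HOST"],      # the RDS endpoint
        "PORT": os.environ.get("DB_PORT", "5432"),
    }
}

# Credentials used by the MediaStorage backend (django-storages / S3)
AWS_ACCESS_KEY_ID = os.environ["AWS_ACCESS_KEY_ID"]
AWS_SECRET_ACCESS_KEY = os.environ["AWS_SECRET_ACCESS_KEY"]
AWS_STORAGE_BUCKET_NAME = os.environ["AWS_STORAGE_BUCKET_NAME"]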

For images, one can proceed similarly, importing django.core.files.images.ImageFile instead.
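
For example, adapting the loop above to the ImageProduit model from my first post would look roughly like this (a sketch, with a made-up image directory):

from pathlib import Path

from django.core.files.images import ImageFile
from products.models import ImageProduit, Produit

opath = Path("./ressources/img_products/12345-front.jpg")   # example file
with opath.open(mode="rb") as f:
    prod = Produit.objects.get(id=opath.name.split("-")[0])
    ImageProduit.objects.create(
        fk_produit=prod,
        image=ImageFile(f, name=opath.name),   # like File, plus width/height handling
        notes="migration - dbase",
    )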