Why is the YOLO model loaded on every call, and how can I solve it?


I am working on a project that uses Django as an API.
I wrote two functions in views.py.
The first function handles the POST method: it takes an image URL, downloads the image, and saves it to the database.
The second function handles the GET method: it gets the image from the database, segments it with the YOLOv5 model, and returns the results.

But every call to the second function loads the YOLOv5 model again.
As a result, I get a CUDA out-of-memory error on the second or third call.
How can I load the model only once and reuse it on every call?

Thanks for your help!

You don’t - at least not within Django itself.

If you want some form of memory-resident model, you’re looking at perhaps something like a custom management command that is always running and exchanges messages with Django as needed.

However you do it, you want to segment your architecture between the persistent models being run and the request/response cycle provided by Django.
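As a rough sketch of that split, assuming a long-running worker that loads the model once and then answers jobs from a queue. The names `load_model` and `serve` are hypothetical, and the in-process `queue.Queue` only stands in for whatever transport you actually use between Django and the worker (a message broker, a socket, and so on):

```python
import queue


def load_model():
    # Hypothetical stand-in for the expensive YOLOv5 load.
    # The real loading code would go here and run exactly once.
    return lambda image_id: {"image_id": image_id, "detections": []}


def serve(requests, responses, loader=load_model):
    """Load the model once, then process jobs until a None sentinel arrives."""
    model = loader()  # loaded once, for the lifetime of the worker
    while True:
        job = requests.get()  # blocks until a job is available
        if job is None:  # sentinel: shut the worker down
            break
        job_id, image_id = job
        responses.put((job_id, model(image_id)))
```

In a real deployment, the body of `serve` would live in something like the `handle()` method of a custom management command running in its own process, and the Django view would only send a job and wait for the matching result.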


I suppose you have your model-loading code in the view. You need to move it out of the view to module level (next to the imports), so that it executes once at startup and persists for successive requests.

Hi Ken,
Is there no way to do this in Django?
I can use the YOLO model in Django for live streaming, and it works.
But in this case, I need the POST and GET methods.
Are the POST and GET methods the problem?

Hi Daniel,
I tried this: I loaded the model in another file and imported it, but the result is the same.
It works when I run it on a port, but under uWSGI the model is loaded on every call.

Django is fundamentally built around the idea of the “request / response” cycle. Objects are created when the request is received, and disposed of when the response is returned.

In a production-quality deployment of Django, you also have multiple processes running. There’s no such thing as “sharing an object between processes in memory”. Additionally, the process manager will, based upon circumstances, restart any individual process.

If you want “persistent entities” between requests in a production Django environment, you want them outside the Django process.

That’s why, for example, Celery and Celery Beat each run as separate processes.