I want to use the multiprocessing module in Django and change the process start method from the default fork to spawn (for the reasons, see "Why your multiprocessing Pool is stuck (it's full of sharks!)"). However, when a task uses Django's ORM, an error occurs. How can I solve this problem?
Django 4.1.4
Simple example:
import multiprocessing
from teleapps.host import models

def split_data():
    multiprocessing.set_start_method("spawn", force=True)
    processes = []
    for i in range(4):
        p = multiprocessing.Process(target=worker_function, args=(i,))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
    print("All workers have finished.")
    return []

def worker_function(task_id):
    models.BaseServer.objects.filter(id=task_id)
    print(f"Worker {task_id} is starting.")
Error:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/python3/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/usr/local/python3/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/appdata/mainProject/server/teleapps/utils/multiprocessing.py", line 2, in <module>
    from teleapps.host import models
  File "/appdata/mainProject/server/teleapps/host/models.py", line 4, in <module>
    class BaseServer(models.Model):
  File "/appdata/mainProject/server/venv/lib/python3.9/site-packages/django/db/models/base.py", line 127, in __new__
    app_config = apps.get_containing_app_config(module)
  File "/appdata/mainProject/server/venv/lib/python3.9/site-packages/django/apps/registry.py", line 260, in get_containing_app_config
    self.check_apps_ready()
  File "/appdata/mainProject/server/venv/lib/python3.9/site-packages/django/apps/registry.py", line 138, in check_apps_ready
    raise AppRegistryNotReady("Apps aren't loaded yet.")
django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet.
Hello there. Before you run any script outside the request/response cycle, Django needs to be configured.
You can do that by placing this at the top of your script:
import os
import django
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "yourapp.settings")
django.setup()
# The rest of your script goes here.
# Do not import any module that may import models before django.setup() has run.
This is documented here.
Another thing: using multiprocessing with Django can lead to unexpected behavior, as you already know, especially in long-running processes, so expect to run into issues.
What process? Are you talking about runserver? If so, why? You shouldn’t be using runserver in anything appearing to be a production environment. If you’re running uWSGI, it has its own process manager.
(I’m not sure what gunicorn does.)
Or are you talking about spawning off your own processes from within Django? If so, that’s a mistake of a whole different nature.
Remember, your Django project does not own the process in which you’re running, and is subject to being terminated at just about any time.
Thank you for your reply.
I have tried this method before, but it doesn't work: I am currently starting the service through runserver, so Django is already loaded for this test file. If django.setup() is added, an error is raised:
  File "/appdata/mainProject/server/teleapps/utils/scheduler.py", line 3, in <module>
    from teleapps.scheduler.main_func.exec_host_task import main_func
  File "/appdata/mainProject/server/teleapps/scheduler/main_func/exec_host_task.py", line 6, in <module>
    from teleapps.host import utils
  File "/appdata/mainProject/server/teleapps/host/utils.py", line 3, in <module>
    from teleapps.host.batch_view import BatchsView
  File "/appdata/mainProject/server/teleapps/host/batch_view.py", line 10, in <module>
    from teleapps.utils.batch import BatchTask
  File "/appdata/mainProject/server/teleapps/utils/batch.py", line 4, in <module>
    from teleapps.utils.multiprocessing import split_data
  File "/appdata/mainProject/server/teleapps/utils/multiprocessing.py", line 7, in <module>
    django.setup()
  File "/appdata/mainProject/server/venv/lib/python3.9/site-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/appdata/mainProject/server/venv/lib/python3.9/site-packages/django/apps/registry.py", line 83, in populate
    raise RuntimeError("populate() isn't reentrant")
RuntimeError: populate() isn't reentrant
Thank you for your reply.
At present I do start the service with runserver. I can also start it with uwsgi, but that brings up the problem of the apscheduler library being executed repeatedly across uwsgi's multiple processes, which I will improve later. I will first start uwsgi with a single process and try again. However, that is a bit different from the problem I described: my example is a simple query (or any operation that needs the Django environment to be loaded) and an error occurs. This problem is easy to reproduce.
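Both tracebacks point at module-level code in teleapps/utils/multiprocessing.py. With spawn, the child re-imports that module before Django is configured (AppRegistryNotReady). A module-level django.setup() runs in the parent while the app registry is still being populated ("populate() isn't reentrant"). One possible way around both, offered as a minimal sketch rather than a verified fix for this project, is to keep the module free of model imports and do the setup and the import inside the worker, which only ever runs in the child. Assumptions here: "yourapp.settings" is the placeholder from the reply above, the num_workers parameter is added for illustration, and .exists() is used only to force the lazy queryset to hit the database.

```python
import multiprocessing
import os


def worker_function(task_id):
    # Runs only in the child. A spawned child starts with a fresh interpreter,
    # so Django has to be configured here before any model is imported.
    import django

    # "yourapp.settings" is a placeholder for the project's settings module.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "yourapp.settings")
    django.setup()

    # Import models only after django.setup(), never at module level.
    from teleapps.host import models

    # .exists() forces the query to execute; a bare filter() is lazy.
    found = models.BaseServer.objects.filter(id=task_id).exists()
    print(f"Worker {task_id} finished, row exists: {found}")


def split_data(num_workers=4):
    # A "spawn" context leaves the process-wide start method untouched,
    # unlike set_start_method("spawn", force=True).
    ctx = multiprocessing.get_context("spawn")
    processes = []
    for i in range(num_workers):
        p = ctx.Process(target=worker_function, args=(i,))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
    print("All workers have finished.")
    return []
```

Because nothing in this module touches Django at import time, importing split_data from batch.py during startup should no longer trigger either error. Each spawned child pays the cost of its own django.setup() and opens its own database connection.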