Tips for finding a severe memory leak in django/db/utils.py?

I have a long-running process in a Django function. It uses an iterator to process 100,000 records.

The processing code does a number of queries: standard stuff like filtering, updating, creating new records, etc.

The problem I’m having is that every time the loop runs, memory usage goes up by 4 MB, eventually hitting 64 GB and triggering the process to be killed. To put it in perspective, my complete database is only 300 MB.

Looking for any tips to debug this.

What I have done so far is:

  • Checked the local and global variables - combined they are less than 1 MB, so I don’t believe it is Python objects.
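
    That check was a rough sys.getsizeof sum over locals() and globals(), along these lines:

    import sys

    local_size = sum(sys.getsizeof(v) for v in list(locals().values()))
    global_size = sum(sys.getsizeof(v) for v in list(globals().values()))
    print(local_size + global_size)   # well under 1 MB in my case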

I used tracemalloc to print the top 10 stats at the start of every loop, and it seems that django/db/utils.py is increasing by 4 MB with every iteration.

Top 10 memory consuming lines:
/home/patrick/.local/share/virtualenvs/envision-oiVnV5sy/lib/python3.13/site-packages/django/db/utils.py:98: size=74.4 MiB, count=22598, average=3453 B
Top 10 memory consuming lines:
/home/patrick/.local/share/virtualenvs/envision-oiVnV5sy/lib/python3.13/site-packages/django/db/utils.py:98: size=78.1 MiB, count=22870, average=3581 B
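
The snippet driving those snapshots is roughly this (the loop itself is simplified; the queryset stands in for my real iterator):

import tracemalloc

tracemalloc.start()

for record in queryset.iterator():      # stand-in for my real loop over ~100,000 records
    snapshot = tracemalloc.take_snapshot()
    top_stats = snapshot.statistics("lineno")
    print("Top 10 memory consuming lines:")
    for stat in top_stats[:10]:
        print(stat)
    # ... the actual processing (queries, updates, creates) ...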

I have tried adding the following to the start of each loop to try and clear some memory.

import gc
from django.core.cache import cache
from django.db import reset_queries

gc.collect()
cache.clear()
reset_queries()

I have also tried turning off caching (dummy backend).
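
For the dummy backend I just pointed CACHES at Django’s DummyCache in settings:

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.dummy.DummyCache",
    }
}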

DEBUG is False.

Nothing seems to improve the situation.

Anything else I can try?

Thanks!

It’s going to be really difficult to try and identify a potential cause of the problem without seeing the code that is being run.

If you look at the source code for django.db.utils, line 98 is in the DatabaseErrorWrapper, which is created by the wrap_database_errors method in BaseDatabaseWrapper. So the root cause wouldn’t be in the wrapper, but in something being done to repeatedly call it.

Diagnosing this would pretty much require examining your code. Superficially, this implies to me that you are creating a lot of instances of this class because of something unusual that you are doing in this loop.

(We run a lot of persistent processes that are constantly working with data, and have never seen behavior like this.)

Thanks for your reply Ken, it’s definitely a tricky one.

There are about 2,000 lines of code in the processing part; I might just need to step through it line by line and try to pinpoint the problem!

Are you dynamically creating connections or altering connections within your loop? If so, that would be the first area I’d check.
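
One rough way to check is to see how many connections the handler is tracking as the loop runs, something like:

from django.db import connections

print(len(connections.all()))   # number of connections Django currently knows about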

Ok I found the problem and I’m surprised I haven’t come across this before.

I’m creating objects inside the loop and appending them to a list; at the end of the routine I use bulk_create to save all the objects. This approach speeds up the loop, since one bulk_create at the end is faster than saving each object individually.
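
Stripped right down, the pattern looks like this (my real loop does a lot more, and the field values are omitted):

plans = []

for record in records:              # the iterator over ~100,000 records
    # ... filtering, updates, other queries ...
    plan = Plan()                   # field values omitted here
    plans.append(plan)              # unsaved instance kept in memory

Plan.objects.bulk_create(plans)     # one bulk insert at the very end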

Anyway, if I create a Django object, for example

>>> p=Plan()

and then get the size of it.

>>> sys.getsizeof(p)
48

It’s 48 bytes, which seems reasonable.

If I add that object to a list and get the size of the list, it’s 88 bytes, which also seems reasonable.

>>> plans = []
>>> plans.append(p)
>>> sys.getsizeof(plans)
88

The problem happens if I create 10,000 objects and add them to a list.

The size of the list itself becomes ~85 KB (it only stores references to the objects, roughly 8.5 bytes per entry), however the memory usage of the Python process goes up by ~21 MB (2,071 bytes per object).

This is a simple example but in my case I’m creating lots of objects within each cycle and appending these to lists.

I haven’t studied the internals of Python, but now that I look at it again I’m assuming that the list is just storing references to the actual objects, which live in memory allocated inside the Django code, hence why I was seeing utils.py increase. It’s strange that my Plan() object is taking up ~2 KB in memory, but anyway.
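
The other part of the confusion is that sys.getsizeof is shallow: it reports the size of the instance wrapper itself, not the per-instance __dict__ where the field values actually live. For example:

import sys

p = Plan()
print(sys.getsizeof(p))             # just the instance wrapper, the 48 bytes above
print(sys.getsizeof(p.__dict__))    # the per-instance dict holding field values and _state
print(sum(sys.getsizeof(v) for v in p.__dict__.values()))   # rough lower bound on the referenced data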

I can probably get around this by doing a bulk save when the lists reach a certain size.
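
Something along these lines (batch size picked arbitrarily):

BATCH_SIZE = 10_000

plans = []
for record in records:
    plan = Plan()                       # field values omitted
    plans.append(plan)
    if len(plans) >= BATCH_SIZE:
        Plan.objects.bulk_create(plans)
        plans.clear()                   # drop the references so the instances can be freed

if plans:
    Plan.objects.bulk_create(plans)     # flush whatever is left over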

Here is a full example; Plan() can be substituted for any Django model:

import os
import sys

import psutil

# Plan is one of my models; substitute any Django model class here

process = psutil.Process(os.getpid())

initial_memory = process.memory_info().rss

plans = []

for i in range(10000):
    plan = Plan()
    plans.append(plan)

final_memory = process.memory_info().rss

memory_increase = final_memory - initial_memory
memory_per_object_increase = memory_increase / 10000

print(f"Initial memory usage: {initial_memory} bytes")
print(f"Memory usage after creating 10,000 objects: {final_memory} bytes")
print(f"Memory increase: {memory_increase} bytes")
print(f"Memory per object increase: {memory_per_object_increase} bytes")

plans_size = sys.getsizeof(plans)
print(f"Memory used by the 'plans' list: {plans_size} bytes")

single_plan_size = sys.getsizeof(plans[0])
print(f"Memory used by a single Plan object: {single_plan_size} bytes")

I ran this on one of my test systems, using the User object instead of Plan:

Initial memory usage: 171741184 bytes
Memory usage after creating 10,000 objects: 174714880 bytes
Memory increase: 2973696 bytes
Memory per object increase: 297.3696 bytes
Memory used by the 'plans' list: 85176 bytes
Memory used by a single Plan object: 48 bytes

I even upped the size of the list to 1,000,000 objects:

Initial memory usage: 174714880 bytes
Memory usage after creating 1,000,000 objects: 470450176 bytes
Memory increase: 295735296 bytes
Memory per object increase: 295.735296 bytes
Memory used by the 'plans' list: 8448728 bytes
Memory used by a single Plan object: 48 bytes

This is showing a total memory increase of ~295 MB for 1,000,000 objects.

I do think there’s something else going on here.

Side note: There are effective upper limits on the number of objects being created in bulk_create. There’s a point beyond which the SQL traffic will be “chunked” anyway, so you might as well do these creates in batches.
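
e.g. bulk_create accepts a batch_size parameter, but note that it only controls how the INSERT statements are chunked; the full list of unsaved instances still has to be built in memory first, so batching the list itself (as you describe above) is what actually helps here.

# chunks the SQL, but the whole "plans" list still exists in memory beforehand
Plan.objects.bulk_create(plans, batch_size=1000)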

Thanks for testing Ken.

I can confirm I get the same result with 1,000,000 User objects, so the problem must be with my models.

Memory usage after creating 1,000,000 objects: 533344256 bytes
Memory increase: 318767104 bytes
Memory per object increase: 318.767104 bytes
Memory used by the 'plans' list: 8448728 bytes
Memory used by a single Plan object: 48 bytes

I might just need to set up a worker that creates the objects in the background and feed it batches of 10,000. :thinking: