Is it a bad idea to rely on Django keeping model object instances alive in the QuerySet’s _result_cache after a first iteration over the QuerySet?
I have a teammate who wrote some code to the effect of:
from foo.models import Foo
from foo.utils import patch_foo_with_name, do_something_with_foo_names
foos = Foo.objects.filter(bar="bla")
name = "Luke"
for foo in foos:
    patch_foo_with_name(foo, name)
do_something_with_foo_names(foos)
Note that this example is heavily simplified. What actually gets patched onto each Foo object is the result of some fairly complex logic that cannot be expressed with QuerySet.annotate(). do_something_with_foo_names() iterates over foos and expects the patched data to be there.
Is this dangerous? I feel that the QuerySet’s cache is an implementation detail of Django that cannot be relied upon. I don’t think there is any explicit guarantee in the documentation that the cache (QuerySet._result_cache) will never drop objects from RAM, although that is how it looks in the Django 1.11 source code. A future version of Django might decide to limit the size of the cache, so that objects could be dropped from it, right?
I already know that this approach is broken if the QuerySet is paginated or further refined - e.g. foos = foos.filter(age__gt=3) - before calling do_something_with_foo_names, but let’s ignore that case for the moment.
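To make that caveat concrete, here is a minimal sketch (using the same Foo model and helpers as above): calling .filter() on an evaluated QuerySet returns a new QuerySet that runs its own query and yields freshly created instances, so the patched data is gone.

from foo.models import Foo
from foo.utils import patch_foo_with_name, do_something_with_foo_names

foos = Foo.objects.filter(bar="bla")
for foo in foos:  # first iteration fills foos._result_cache
    patch_foo_with_name(foo, "Luke")

# .filter() returns a *new* QuerySet; it does not share the cache above.
young_foos = foos.filter(age__gt=3)

# This hits the database again and iterates over freshly created
# Foo instances, so the patched data is missing here.
do_something_with_foo_names(young_foos)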
It’s a great idea to rely on the QuerySet result cache. Far from being an implementation detail, it’s documented behaviour that allows you to avoid refetching data from your database. If there’s any wording that could be improved in the docs here, please feel free to make a PR and tag me on it.
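For example (a rough sketch, assuming DEBUG=True so that connection.queries is populated), the second evaluation below is served from the cache rather than running another query:

from django.db import connection, reset_queries
from foo.models import Foo

reset_queries()
foos = Foo.objects.filter(bar="bla")

list(foos)  # first evaluation: runs the query and fills the result cache
list(foos)  # second evaluation: reuses the cached results

# Only one query was executed (connection.queries requires DEBUG=True).
assert len(connection.queries) == 1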
I understand that the purpose of the cache is to avoid refetching data from the database.
My concern is that the purpose of the cache might not include persisting, across iterations, extra things that I patched onto the objects.
Without digging into the Django source code, I would not know how Django avoids refetching data. What if Django were caching the database results in some data structure that is more compact than the full-fledged model objects, and re-instantiating the model objects every time the queryset is iterated over? I think that’s certainly a possibility, depending on how the Django core developers view the performance/memory usage trade-off. I hope I’m making sense.
Django will never change the representation of models in that way - it would break too many things, including built-in features like prefetch_related and annotations. Re-instantiating models can even have side effects due to pre/post init signals.
The result cache is implemented as a simple list of model instances and I can assure you it won’t change.
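To illustrate the behaviour being described, here is a minimal sketch (it peeks at the private _result_cache attribute purely for demonstration): re-iterating an evaluated QuerySet yields the very same instances, so attributes patched onto them persist.

from foo.models import Foo
from foo.utils import patch_foo_with_name

foos = Foo.objects.filter(bar="bla")

first_pass = list(foos)   # evaluates the QuerySet and fills foos._result_cache
for foo in first_pass:
    patch_foo_with_name(foo, "Luke")

second_pass = list(foos)  # no new query; the cached list is reused

# The cache is a plain list of the same model instances, so anything
# patched on during the first pass is still there on the second pass.
assert foos._result_cache is not None
assert first_pass[0] is second_pass[0]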