Possible speed up for caching (minor speed up)

Hi,
so I've been exploring a small change in the caching framework which would, hopefully, result in a speedup in caching.

Though I admit I'm not a master of benchmarking and I only ran a simple test.
The difference is very small, but it's there.

You can find a small example at https://github.com/amirreza8002/django_cache_test
My proposal is implemented in https://github.com/amirreza8002/django_cache_test/blob/main/django_project/settings.py

Where instead of the constant use of __getattr__, I move all the methods onto the proxy class via a loop and setattr.
And since the cache object is only instantiated once, the overhead of the loop shouldn't affect projects.
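
Roughly, the idea looks something like this (a simplified sketch of the approach, not the exact code from the repo):

from django.core.cache import caches, DEFAULT_CACHE_ALIAS

class NewConnectionProxy:
    def __init__(self, connections, alias):
        # resolve the backend once, then copy its public attributes
        # (bound methods included) onto the proxy, so later calls skip
        # the per-access __getattr__ lookup
        backend = connections[alias]
        for name in dir(backend):
            if not name.startswith("_"):
                setattr(self, name, getattr(backend, name))

cache = NewConnectionProxy(caches, DEFAULT_CACHE_ALIAS)
cache.set("greeting", "hello")
assert cache.get("greeting") == "hello"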

Where are your benchmarking results? I would expect this to only make nanoseconds of difference.

Hello, yup,
it's around 5000 or 6000 nanoseconds per call (with the memory backend).

But it's easy to do and the cache is used frequently, so I thought it might be worth mentioning.
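
For context, the benchmark loop is roughly the following (a sketch of what tests/test_performance.py does; the exact code is in the repo):

import time

from django.core.cache import cache

ROUNDS = 10_000
start = time.perf_counter_ns()
for i in range(ROUNDS):
    # each round does three cache calls: set, get, delete
    cache.set("key", i)
    cache.get("key")
    cache.delete("key")
total = time.perf_counter_ns() - start
print(f"total time: {total} ns")
print(f"avg time:   {total / ROUNDS} ns per call")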

This is the result of one test run:

tests/test_performance.py 
running 10000 rounds of cache calls for stable cache

cache times:
total time: 137648160 ns
avg time:   13764.816 ns per call
.

running 10000 rounds of cache calls for experimental cache

experimental cache times:
total time: 78449430 ns
avg time:   7844.943 ns per call

diff: 5919.8730000000005 ns per call on avg
.

5000 nanoseconds is 5 microseconds, or 0.000005 seconds.

It would be interesting to see how much time the setup takes, to figure out how many cache requests would be needed to make up the difference.
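
Measuring it could look something like this (a sketch, reusing the NewConnectionProxy class sketched above):

import time

from django.core.cache import caches, DEFAULT_CACHE_ALIAS

start = time.perf_counter_ns()
# one-time cost: building the proxy resolves the backend and copies
# all of its public attributes
proxy = NewConnectionProxy(caches, DEFAULT_CACHE_ALIAS)
print(f"setup: {time.perf_counter_ns() - start} ns")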

Remember that worker processes don’t run “forever”. They do restart periodically, which means that this “setup cost” is incurred on every process (re-)start.

My gut reaction to this is that this is a “micro-optimization” with no real-world benefit, especially when compared to the rest of the time spent handling the request.

You are correct that people probably won't feel the change, unless they have a lot of requests.

I should also correct myself: the diff that the benchmark shows is not for one call, it's for three calls (set, get, delete).

About the setup time: it takes about a thousand cache calls to make up for the setup overhead (for the memory backend). Roughly, at ~2000 ns saved per call, that implies a one-time setup cost on the order of 2 ms.
I'm not sure what the average number of cache calls per process lifetime is, so I can't judge whether that's good or bad.

I also think this could show itself more with cache libraries that are slower, such as django-redis and django-valkey.

But I think perhaps this post will work well enough as an informative note for people who are looking to optimize; I'm not sure it would benefit Django's source code.

That’s not quite what I’m saying.

What I’m saying is that people won’t feel the change, regardless. The theoretical time saved is below the threshold of measurement error, much less than what anyone is ever going to see.

Why do you think this would make a difference? You’re not altering the processing time of the caches themselves.

I disagree; I see this as a “false optimization”. Saving nanoseconds in operations measured in milliseconds is insignificant to the point of meaninglessness. I believe people's time would be better spent looking for improvements where it's going to make a difference.

For me to consider this of having any potential value at all, you’d have to demonstrate that it provides more than a 0.1% response improvement for real requests being serviced. (Reducing a 100ms response to 99.995ms doesn’t cut it.)

What you are saying is correct.
I'm not disagreeing; I'd say I've learned a few points here, so thanks for that.

Nor was it my intention to say we should do this to optimize applications.

The reason I shared this is that it's easy and safe to do,
so in my mind, giving the CPU less work to do when it's easy is something noteworthy.

I don't agree that it's a false optimization. Yes, it's not a good way to optimize a slow application, and one shouldn't think of it for that, but it does optimize, and it can be a place to play around and experiment, especially since the main point of caching is speed.

I don't have a benchmark on me, but what I remember from the last time I tested is that the difference was bigger there, although still a small number.
I'll share the results if what I remember is correct.

Hi, I'm really not sure this is safe to do. As per your implementation, cache is instantiated only once, resulting in this instance being shared between threads, which is not the case with the proxy implementation from Django.

This means you can face problems with non-thread-safe backends.

Hello @antoinehumbert,
can you explain what you mean?
From what I can see, cache instantiation is the same in both implementations; I've just changed the internal part of the class.

In Django, setting the cache global variable at https://github.com/django/django/blob/main/django/core/cache/__init__.py instantiates a ConnectionProxy (defined in https://github.com/django/django/blob/main/django/utils/connection.py). This instantiation does not actually create a new connection. The creation of the real cache backend instance is delayed until the first call to a cache method, which triggers CacheHandler.__getitem__ (in the same django/utils/connection.py) and creates a cache backend instance for the current thread, leading to a different cache backend instance per thread.

In your case, as you retrieve the cache backend instance in the constructor of your proxy, it will be instantiated in the main thread. After that, when using cache anywhere in the application, CacheHandler.__getitem__ won't be called again, so every thread will use the same cache instance that was created in the main thread. Sharing a non-thread-safe backend instance between different threads will lead to errors.
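
A quick way to see the per-thread behaviour (a sketch; with Django's proxy the two ids differ, while the eager setattr proxy would keep every thread on the instance created at startup):

import threading

from django.core.cache import caches

def report(label):
    # caches["default"] goes through CacheHandler.__getitem__, which keeps
    # one backend instance per thread in thread-local storage
    print(label, id(caches["default"]))

report("main thread:")
worker = threading.Thread(target=report, args=("worker thread:",))
worker.start()
worker.join()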

I think your optimisation just annihilates the reason why ConnectionProxy exists, which is ensuring the thread safety of cache backends.

Given the implementation of NewConnectionProxy, which instantiates an empty object and transfers all attributes from the underlying caches[alias] instance to that object,

cache = NewConnectionProxy(caches, DEFAULT_CACHE_ALIAS)
is roughly equivalent to
cache = caches[DEFAULT_CACHE_ALIAS]

… if it were that simple, ConnectionProxy would not exist.