Custom Caching for Django ForeignKey

kracekumar · October 17, 2021, 6:39am

from django.db import models


class PublisherManager(models.Manager):
    def get(self, *args, **kwargs):
        ...  # custom lookup logic elided


class Publisher(models.Model):
    name = models.CharField(max_length=200)


class Book(models.Model):
    publisher = models.ForeignKey(Publisher, on_delete=models.PROTECT)
    name = models.CharField(max_length=255)

There is a DRF serializer that returns each book together with its publisher. There are thousands of books but only about a hundred publishers. The data distribution is 1:1000, so it makes sense to keep publisher objects in the Python process's memory after the first load until the next app restart.

How can I cache the publisher using lru_cache or any other in-process Python cache?

Constraints and other info

I’m already using the Django cache with a Redis backend for other purposes, and I don’t think Redis would give a significant boost (it is still a network call plus a lookup).
Adding a custom manager to the Publisher model with get, filter, and all methods doesn’t help. The all method gets called while loading a foreign key, but the manager instance has no access to the SQL parameters.
The object lookup happens in the related descriptor’s __get__.

Is there a way to change this behavior without writing a custom Field? The idea is to serve subsequent lookups from a cache for a period of time.
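The idea of serving repeated lookups from process memory for a period of time can be sketched without any Django machinery. This is only an illustration: `TTLCache`, `load_publisher`, and the 300-second lifetime are hypothetical names and values, and `load_publisher` stands in for a real query such as `Publisher.objects.get(pk=pk)`.

```python
import time


class TTLCache:
    """Process-local cache whose entries expire `ttl` seconds after loading."""

    def __init__(self, loader, ttl, clock=time.monotonic):
        self._loader = loader      # called on a miss to fetch the value
        self._ttl = ttl            # entry lifetime in seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # still fresh: no loader call
        value = self._loader(key)  # miss or expired: reload and store
        self._entries[key] = (now + self._ttl, value)
        return value

    def clear(self):
        self._entries.clear()


calls = []  # records every time the (hypothetical) database is hit


def load_publisher(pk):
    # Hypothetical stand-in for Publisher.objects.get(pk=pk).
    calls.append(pk)
    return {"id": pk, "name": f"Publisher {pk}"}


publisher_cache = TTLCache(load_publisher, ttl=300.0)
publisher_cache.get(1)  # loads from the "database"
publisher_cache.get(1)  # served from process memory
```

Unlike `functools.lru_cache`, this expires entries by age, which matches the "for a period of time" requirement. The cache is per process, so each worker holds its own copy and may serve data that is stale by up to `ttl` seconds.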

KenWhitesell · October 18, 2021, 12:27am

Have you done any actual benchmarking of the relative performance between the two methods?
Keep in mind that PostgreSQL will cache data if memory is available to do so. If you’ve got the space for a memory cache for that data, that could be memory used by PostgreSQL as well. And, allowing PostgreSQL to do the caching also means that it will maintain the relationships between the books and publishers (the indexes) in memory as well.
It might be worth doing a POC benchmark test to see what the relative improvements would be. Given the overhead of the serializer processes themselves, I’ve got the funny feeling that you’re not going to see the improvements you might think you’ll see.

kracekumar · October 24, 2021, 6:14pm

Thanks. Yes, Postgres will cache the data since it’s frequently used data.

I did a POC to measure the improvement. It saved 10% of the total query time (for the subset of queries going to Postgres from ForeignKey access). In the grand scheme of things, that was small. Nonetheless, it was useful to learn how to do it.

The solution was to create a custom ForeignKey with its own forward descriptor (a subclass of ForwardManyToOneDescriptor, if I recollect correctly).

Thanks for your input.
