GSoC 2024 Proposal Feedback: Improve Database Cache Backend

Hi everyone,

I have created a draft proposal for Google Summer of Code. Any feedback would mean a lot to me!
Pinging @adamchainz, since most of the inspiration for this proposal came from him.

Hi @hisham :wave: here are some thoughts

All tests pass for these two method changes.

Is it worth linking a draft PR with those changes to show these tests pass on all backends?
Have you checked the code coverage of the existing tests?

At times you share some benchmarks - that’s great! Have you investigated django-asv and whether we have existing official benchmarks for these (and if not, maybe add them)?

I wonder if there’s anything we can learn from Laravel’s database cache: Cache - Laravel 10.x - The PHP Framework For Web Artisans
Might be worth taking a look?

To strengthen the proposal, I think you should:

  • communicate what you have researched (what the best practices are, whether other frameworks do it differently, Django packages, Python packages)
  • describe the challenges and the fiddly bits, and how you would overcome them

Hi @sarahboyce,

Thank you for your thoughtful feedback. I appreciate your suggestion about linking a draft PR to demonstrate the passing tests across all backends. Here’s the draft pull request you requested. At a quick glance, test coverage already looks very good.

Regarding benchmarks, I have indeed looked into django-asv. It did not have benchmarks for the cache, so I used this benchmark because it was easier to set up for different databases. I also created a draft pull request that adds these benchmarks to django-asv, at least for SQLite.

I looked into Laravel’s database cache and these were some of my findings:

  • The way their database cache driver works is surprisingly similar to Django’s database cache.
  • Their driver goes through their ORM everywhere (I don’t plan on doing this in all places, because it might be less performant than raw SQL queries. Maybe it would really shine in places such as set_many, where you could use bulk_create on SQLite and PostgreSQL, but not on MySQL, Oracle or MariaDB?)
  • They have integrated their cache tables into the default migrations since v11 (maybe this can be done here too, by creating a new contrib.cache app or placing the migrations somewhere else?)
  • They also run different queries depending on the database, such as here (Django’s cache philosophy also includes “A cache should be as fast as possible”, and I think this is best achieved by customizing for each database)
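As a rough illustration of what customizing for each database could look like, here is a minimal sketch that picks a multi-row upsert statement by vendor. It is not code from Laravel or from the draft pull request. The helper name build_upsert_sql and the table name are hypothetical. The vendor strings follow Django’s connection.vendor values, the columns follow Django’s cache table (cache_key, value, expires), and the %s placeholders follow the style Django cursors accept.

```python
# Hypothetical helper: choose a multi-row upsert statement per database vendor.
# MariaDB reports itself to Django as "mysql", so it shares that entry.
_CONFLICT_CLAUSES = {
    "sqlite": (
        "ON CONFLICT (cache_key) DO UPDATE SET "
        "value = excluded.value, expires = excluded.expires"
    ),
    "postgresql": (
        "ON CONFLICT (cache_key) DO UPDATE SET "
        "value = excluded.value, expires = excluded.expires"
    ),
    "mysql": (
        "ON DUPLICATE KEY UPDATE "
        "value = VALUES(value), expires = VALUES(expires)"
    ),
}


def build_upsert_sql(vendor, table, row_count):
    """Return one INSERT statement that upserts row_count cache rows.

    Raises NotImplementedError for vendors without an entry (for example
    Oracle), so the caller can fall back to one query per key.
    """
    if row_count < 1:
        raise ValueError("row_count must be at least 1")
    try:
        conflict_clause = _CONFLICT_CLAUSES[vendor]
    except KeyError:
        raise NotImplementedError(f"no single-query upsert for {vendor!r}") from None
    rows = ", ".join(["(%s, %s, %s)"] * row_count)
    # Real code should quote the table name, e.g. with connection.ops.quote_name.
    return (
        f"INSERT INTO {table} (cache_key, value, expires) "
        f"VALUES {rows} {conflict_clause}"
    )


sql = build_upsert_sql("postgresql", "my_cache_table", 2)
```

The dictionary keeps the vendor-specific part in one place, so adding or removing a backend does not touch the code that builds the rest of the statement.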

To strengthen the proposal, I’ve been looking into best practices, the approaches taken by other frameworks, and relevant Django and Python packages. I plan to include these findings in the updated proposal this upcoming weekend, as I am a little busy right now with midterm exams.

I would love to hear your thoughts. Also, if there are any other areas you think I should focus on, please let me know.

Amazing, well done @hisham!

Really interesting to see the findings of looking into Laravel :grin:

I added some comments to the django-asv PR, and it is also good to know where the other benchmarks came from.

Another thing I would do is look at the other proposals and think about how you would rank them. I can see at least 9 proposals on the forum. Last year we had 2 GSoC slots (not sure how many we’ll be granted this year). If you think there are 3 proposals as good as or better than yours, then you need to strengthen the proposal, and you can take inspiration from other (and past accepted) proposals for this.

I added a proof of concept for a single-query set_many to the draft pull request. It shows how set_many can use a single query instead of n queries on SQLite, PostgreSQL, MySQL and MariaDB, which improves performance.
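To make the single-query idea concrete, here is a minimal, self-contained sketch using Python’s sqlite3 module. It is not the code from the draft pull request. The function name set_many_single_query and the table demo_cache are made up for illustration, and the sketch assumes an SQLite version with upsert support (3.24 or later).

```python
import sqlite3


def set_many_single_query(conn, table, rows):
    """Upsert every (cache_key, value, expires) row with one INSERT statement."""
    rows = list(rows)
    if not rows:
        return
    placeholders = ", ".join(["(?, ?, ?)"] * len(rows))
    sql = (
        f"INSERT INTO {table} (cache_key, value, expires) "
        f"VALUES {placeholders} "
        "ON CONFLICT (cache_key) DO UPDATE SET "
        "value = excluded.value, expires = excluded.expires"
    )
    # Flatten the rows into one parameter list. Very large batches would need
    # chunking to stay under SQLite's limit on bound parameters.
    params = [field for row in rows for field in row]
    conn.execute(sql, params)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_cache ("
    "cache_key TEXT PRIMARY KEY, value TEXT NOT NULL, expires TEXT NOT NULL)"
)
# One existing entry that the batch should overwrite.
conn.execute("INSERT INTO demo_cache VALUES ('a', 'old', '2024-01-01 00:00:00')")

set_many_single_query(
    conn,
    "demo_cache",
    [
        ("a", "new", "2030-01-01 00:00:00"),
        ("b", "fresh", "2030-01-01 00:00:00"),
    ],
)
result = dict(conn.execute("SELECT cache_key, value FROM demo_cache"))
```

After the call, the existing key a holds the new value and key b has been inserted, all from one statement rather than one query per key.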