I am interested in this project and have been researching it, primarily from the sources mentioned in the GSoC idea itself (SummerOfCode2023 – Django). Based on my reading of the Django docs on caching and the blog post written by Adam Johnson (cc @adamchainz), I think I have a reasonable understanding of what the current issues are and what needs to be worked on. I'll try to summarise the issues with the default `DatabaseCache`, which are explained very well in the blog post, whose approach is also currently in use in `django-mysql`. (Link to the blog post: Building a better DatabaseCache for Django on MySQL - Adam Johnson)
Storage is done using the `TEXT` type. This is probably for historical reasons: when the backend was written, not all of the databases supported by Django had a corresponding binary data type. As far as I know, all of the supported databases now have some form of binary column type, which will definitely be useful for us.
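To make the cost of text storage concrete, here is a minimal sketch using only the standard library. It assumes, as I understand the current backend, that values are pickled and then base64-encoded so they fit in a text column. The function names are hypothetical and only for illustration; they are not Django APIs.

```python
import base64
import pickle


def encode_for_text_column(value):
    """Pickle, then base64-encode so the bytes are safe in a text column."""
    pickled = pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)
    return base64.b64encode(pickled).decode("latin1")


def encode_for_binary_column(value):
    """Pickle only; a binary column can hold the raw bytes directly."""
    return pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)


# Example value; the contents are arbitrary.
value = {"user_id": 42, "payload": "x" * 1000}
text_form = encode_for_text_column(value)
binary_form = encode_for_binary_column(value)

# base64 emits 4 characters for every 3 input bytes, so the text form
# is roughly a third larger than the raw pickle.
print(len(binary_form), len(text_form))
```

A binary column would avoid both the size overhead and the extra encode/decode step on every read and write.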
Another issue that Adam mentions (and has implemented a solution to in `django-mysql`) is that the `SELECT COUNT(*)` operation is quite expensive, and it is run on every `set()` operation to check whether the cache limit has been breached. The approach that Adam provides is probabilistic: only a certain percentage of `set()` operations trigger this check, which improves the overall efficiency of the operation. However, this means the limit might be exceeded in some cases until the next check runs.
I am currently going through the codebases of both Django and `django-mysql` to better understand the current implementations and what we can do to improve them.