I discovered that my database cache, which uses no expiration timeouts (invalidation is handled entirely in our code), appears to be getting culled, and I cannot figure out what’s causing it.
I happened to discover it because some newly loaded data revealed a ZeroDivisionError during an export operation. I had cleared caches before the export so the export itself was triggering cache creations, and I had a console print that showed the values as they were set in the cache table.
But when I restarted the export from the beginning the next day (in the Django shell), I saw cached values being set again right away! I tested all our caching code and could not reproduce the issue, so I started querying the size of the cache table periodically to monitor its growth. MAX_ENTRIES is set to 1,500,000, but I occasionally observed drops in the table size:
In [1]: from django.db import connections
...: from django.conf import settings
...:
...: cache_settings = settings.CACHES.get("default", {})
...:
...: table_name = cache_settings.get('LOCATION', 'django_cache')
...:
...: db_alias = cache_settings.get('DATABASE', 'default')
...:
...: connection = connections[db_alias]
...:
...: with connection.cursor() as cursor:
...:     cursor.execute(f"SELECT COUNT(*) FROM {table_name}")
...:     row = cursor.fetchone()
...:     count = row[0]
...:
In [2]: count
Out[2]: 111269
...
In [6]: count
Out[6]: 107421
I saw it happen a few times (checking infrequently). At one point, it dropped down to 95K.
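To make the periodic checks easier to compare, the samples can be reduced to just the decreases. This is only a sketch: `find_drops` is a hypothetical helper, the labels are the shell prompt numbers, and the two counts are the ones shown above.

```python
from typing import List, Tuple


def find_drops(samples: List[Tuple[str, int]]) -> List[Tuple[str, str, int]]:
    """Return (earlier_label, later_label, rows_lost) for every decrease
    between consecutive row-count samples."""
    drops = []
    for (prev_label, prev_count), (label, count) in zip(samples, samples[1:]):
        if count < prev_count:
            drops.append((prev_label, label, prev_count - count))
    return drops


# Counts taken from the shell session above.
samples = [("In [2]", 111269), ("In [6]", 107421)]
print(find_drops(samples))  # [('In [2]', 'In [6]', 3848)]
```

A drop of a few thousand rows between two checks, with the table far below MAX_ENTRIES, is the behavior that needs explaining.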
I’m using None for the timeout when I set values, so each cache table entry has an expires value of '9999-12-31 23:59:59+00:00'.
This is the setting for CACHES:
In [1]: from django.conf import settings
In [2]: settings.CACHES
Out[2]:
{'default': {'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
'LOCATION': 'tracebase_cache_table',
'TIMEOUT': None,
'OPTIONS': {'MAX_ENTRIES': 1500000},
'KEY_PREFIX': 'PROD'}}
This is how we’re setting the cache values:
cachekey = get_cache_key(rec, cache_func_name)
cache.set(cachekey, value, timeout=None, version=1)
if settings.DEBUG:
    print(f"Setting cache {cachekey} to {value}")
This is how we’re retrieving cached values:
good_cache = True
uncached = object()
cachekey = get_cache_key(rec, cache_func_name)
result = cache.get(cachekey, uncached)
if result is uncached:
    result = None
    good_cache = False
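The sentinel matters because some cached values are legitimately None (normalized_labeling, for example), so a plain `cache.get(key)` could not tell a stored None from a miss. A minimal sketch of the same logic, using a plain dict as a stand-in for the cache backend (the dict, the `lookup` helper, and the 999999 key are illustrative assumptions, not our code):

```python
# Stand-in for the cache backend: a plain dict, NOT Django's cache API.
fake_cache = {"PeakGroupLabel.802620.normalized_labeling": None}

# Unique sentinel object; no cached value can ever be identical to it.
uncached = object()


def lookup(key):
    """Return (result, good_cache), mirroring the retrieval logic above."""
    result = fake_cache.get(key, uncached)
    if result is uncached:
        return None, False  # miss: the key is not in the cache
    return result, True  # hit, even when the stored value is None


print(lookup("PeakGroupLabel.802620.normalized_labeling"))  # (None, True)
print(lookup("PeakGroupLabel.999999.normalized_labeling"))  # (None, False)
```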
And whenever we delete any caches, there’s a debug print, but I never see it during my export:
delete_keys = []
# For every cached property, delete the cache value
for cached_function in self.get_my_cached_method_names():
    cache_key = get_cache_key(self, cached_function)
    if settings.DEBUG:
        print(f"Deleting cache {cache_key}")
    delete_keys.append(cache_key)
if len(delete_keys) > 0:
    cache.delete_many(delete_keys)
I’ve asked our DB admin to see if they applied a culling mechanism without my knowledge.
This is an example of the console output from my cache set method that clued me in that caches that should have already existed were being set again:
Setting cache PeakGroupLabel.802620.enrichment_fraction to 0.0
Setting cache PeakGroupLabel.802620.enrichment_abundance to 0.0
Setting cache PeakGroupLabel.802620.normalized_labeling to None
Setting cache PeakGroupLabel.802621.enrichment_fraction to 0.0144748856194831
Setting cache PeakGroupLabel.802621.enrichment_abundance to 27084.94942029584
Setting cache PeakGroupLabel.802621.normalized_labeling to None
Setting cache PeakGroupLabel.802622.enrichment_fraction to 0.20433205129860907
Setting cache PeakGroupLabel.802622.enrichment_abundance to 4704198.803759875
Setting cache PeakGroupLabel.802622.normalized_labeling to None
I’m running out of ideas to figure out why my caches aren’t persisting. Do you guys have any suggestions? Do you think that somehow culling is happening? How can I confirm that or rule it out?
We’re on Django 4.2.27.
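One possible way to confirm or rule out culling from within a single process is to wrap the backend's cull routine so that every call is recorded along with its call stack. The sketch below is generic Python; `log_method_calls` is a hypothetical helper. The commented usage assumes the private method `DatabaseCache._cull`, which is an internal detail of the database backend and may differ between Django versions.

```python
import functools
import traceback


def log_method_calls(cls, method_name, sink):
    """Wrap cls.method_name so each call appends a record to `sink`
    and then runs the original method unchanged."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        sink.append({
            "method": f"{cls.__name__}.{method_name}",
            # Short stack trace showing which code path triggered the call.
            "stack": "".join(traceback.format_stack(limit=8)),
        })
        return original(self, *args, **kwargs)

    setattr(cls, method_name, wrapper)
    return original  # returned so the patch can be undone with setattr


# Assumed usage in the Django shell before starting the export
# (requires Django; `_cull` is a private backend method):
#
#   from django.core.cache.backends.db import DatabaseCache
#   cull_calls = []
#   log_method_calls(DatabaseCache, "_cull", cull_calls)
#   ... run the export ...
#   print(len(cull_calls))
```

This only sees calls made in the patched process. If the list stays empty while the row count still drops, the deletions are coming from somewhere else, such as another process using the same table or something acting directly on the database.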