I’d like to gather opinions on the merits of signing cached data to mitigate remote code execution attacks in the event of a cache compromise.
All of Django’s built-in cache backends (in-memory, memcached, Redis, database, and file-based) serialize cache data using pickle. If an attacker can write to a cache, they can plant malicious cache data that executes arbitrary code when the Django application server unpickles it.
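To make the mechanism concrete, here is a deliberately harmless sketch of why unpickling attacker-controlled bytes amounts to code execution. The `Payload` class is hypothetical and only for illustration. `os.getpid` stands in for whatever callable an attacker would actually choose (for example `os.system`).

```python
import os
import pickle


class Payload:
    """Hypothetical attacker-crafted object, for illustration only."""

    def __reduce__(self):
        # pickle stores "call this callable with these arguments" and
        # performs the call at load time. os.getpid is a harmless
        # stand-in for something like os.system.
        return (os.getpid, ())


# What an attacker with write access would place in the cache.
blob = pickle.dumps(Payload())

# What the application server does on a cache hit.
result = pickle.loads(blob)

# The loader did not get a Payload back; it got the return value of a
# callable chosen by whoever wrote the bytes, run in the server process.
print(type(result).__name__, result == os.getpid())
```

The point is that the bytes, not the reading code, decide which callable runs during `pickle.loads`.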
This longstanding known issue has been reported on Trac (2024) and HackerOne (2021), and I answered both reports by saying, “Of course there are going to be problems when servers are compromised. Django could sign the data it puts into the cache, but this probably has too many performance penalties for a cache.”
The Django docs warn:
An attacker who gains access to the cache file can not only falsify HTML content, which your site will trust, but also remotely execute arbitrary code, as the data is serialized using pickle.
(This warning for the file-based cache backend is applicable to all backends.)
With that as background…
I’m part of the team developing a MongoDB backend for Django. It has its own cache backend (since Django’s built-in database cache backend is SQL-specific), and the MongoDB security team has flagged its use of pickle as a vulnerability needing remediation. Their proposal is optional (enabled by default) HMAC signing of cache entries. I’ve pushed back, citing inconsistency with Django’s built-in cache backends and questioning whether the overhead of signing would slow the cache enough to make it largely useless.
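For discussion purposes, here is a minimal sketch of what HMAC signing of pickled cache entries could look like, along with a rough way to measure the overhead in question. This is not the MongoDB backend's code or any Django API. The names `dump_signed`, `load_signed`, and `measure` are hypothetical, and the choice of HMAC-SHA256 with the tag prepended to the payload is my assumption.

```python
import hashlib
import hmac
import pickle
import timeit

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def dump_signed(value, key: bytes) -> bytes:
    """Pickle a value and prepend an HMAC-SHA256 tag (hypothetical helper)."""
    payload = pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def load_signed(blob: bytes, key: bytes):
    """Verify the tag first; only unpickle if it matches (hypothetical helper)."""
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("cache entry failed signature check")
    return pickle.loads(payload)


def measure(value, key: bytes, number: int = 10_000) -> dict:
    """Time plain unpickling against verify-then-unpickle, in seconds."""
    plain = pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)
    signed = dump_signed(value, key)
    return {
        "pickle_only": timeit.timeit(lambda: pickle.loads(plain), number=number),
        "pickle_plus_hmac": timeit.timeit(
            lambda: load_signed(signed, key), number=number
        ),
    }


if __name__ == "__main__":
    key = b"stand-in for a secret such as SECRET_KEY"
    entry = {"html": "<p>cached fragment</p>" * 100, "status": 200}
    print(measure(entry, key))
```

The numbers will vary with entry size and hardware, so I would treat the output as a starting point for the performance question rather than an answer. Note that signing only helps if the key lives somewhere the attacker who can write to the cache cannot read.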
Regardless of the decision that MongoDB makes on this, do you see any merit to adding optional cache signing to Django’s built-in cache backends?
I feel the decision at MongoDB is largely based on liability concerns related to their hosted database offering (e.g. a breach of their Cloud database could lead to compromise of customer application servers), but it also leaves me with the strange impression that MongoDB may be unusually vulnerable to compromise, such that additional protections are necessary. How do you feel about it?