Hashed Only Manifest Static Files Storage

I was working on improving static file caching for a site today, and my Nginx config got a bit complex because I don’t want to cache the unprocessed source files with unhashed names:

    # This matches static files that include a hash in their file names.
    # (i.e., those processed by collecstatic). These files should be
    # cached essentially "forever".
    location ~* ^/static/(.*?[a-z0-9_-]+\.[a-f0-9]+\.[a-z]+)$ {
        expires 1y;
        add_header cache-control "public, immutable";
        alias /sites/example.com/current/static/$1;
    }

    # This matches static files that don't include a hash in their file
    # names. These should NOT be cached forever. This allows static
    # source files to be viewed in the browser without needing to know
    # the hashed file name.
    location /static {
        alias /sites/example.com/current/static;
    }

The regex for matching hashed file names seemed somewhat brittle, so I started thinking about other options, and it occurred to me that I don’t really need or want the unprocessed source files to be available in production.

I did a bit of searching, and it doesn’t look like ManifestStaticFilesStorage supports this, so I came up with the following (it removes the source files from the STATIC_ROOT destination during post-processing just in case they’re needed before that):

import os

from django.contrib.staticfiles.storage import ManifestStaticFilesStorage as Base


class ManifestStaticFilesStorage(Base):
    """Manifest storage that saves only files with hashed names.

    The purpose of this is to avoid deploying unprocessed static source
    files. Not only are these files typically not needed in production,
    their presence makes static location configuration (in Nginx,
    Apache, etc) more complex, especially with regard to caching.

    When using the base manifest storage, `collectstatic` first copies
    the unprocessed source files to the `STATIC_ROOT` directory (e.g.,
    `./static`). It then post-processes the source files, creating files
    with hashed file names and saves them to the `STATIC_ROOT`
    directory.

    During the post-processing phase, this storage system removes the
    source files that were copied to `STATIC_ROOT` (but keeps source
    files that weren't post-processed).

    """

    def post_process(self, *args, **kwargs):
        process = super().post_process(*args, **kwargs)
        for name, hashed_name, processed in process:
            yield name, hashed_name, processed
            if processed:
                os.remove(self.path(name))

With this in place, I was able to simplify my Nginx config to just:

    location /static {
        expires 1y;
        add_header cache-control "public, immutable";
        alias /sites/example.com/current/static;
    }

If you do any kind of dynamic static file lookup (probably not very common), you’ll have to take into account that the source files aren’t present and do something like this:

from django.conf import settings
from django.contrib.staticfiles.finders import find
from django.contrib.staticfiles.storage import staticfiles_storage


if settings.DEBUG:
    file_system_path = find(name)
else:
    hashed_name = staticfiles_storage.stored_name(name)
    file_system_path = staticfiles_storage.path(hashed_name)