Hashed Only Manifest Static Files Storage

I love ManifestStaticFilesStorage. Actually, I’m really using the CompressedManifestStaticFilesStorage from whitenoise, but what I want to bring up isn’t about the compression.

Once the files are hashed, those become excellent candidates for being cached infinitely, because the contents will not change. This is great. Static files served this way are excellent candidates for running through CDNs as well, because we’re not worried about trying to bust those caches.

Unfortunately, when using the ManifestStaticFilesStorage, the original file is also kept under its unhashed name, alongside the hashed copy. The contents at that path change whenever the static file changes, so it is not suitable for aggressive caching in general.

What I’d like to be able to do is ensure that static files are served only via the hashed, stable name that won’t change. This would let me trust that nothing out there is reading my static file from the non-hashed filename and having it changed out from under it during a deployment. It would help me sleep at night, and deploy faster. It would allow me to be confident that a CDN for my static files would always be serving the file I intend it to.

The only static file that I’ve currently thought of really wanting at a stable URL is my favicon. For that one, I actually think I’d prefer to serve that up as a regular view, rather than serving it as a static file that I have a redirect pointing to. It would make it rather more obvious, to me at least, that this is a dynamic URL that changes when I need it to, and needs to be separately considered as far as caching headers.

Is a static files storage that only serves the hashed manifest files a reasonable idea, or are there significant challenges I’ve not considered? Does it exist somewhere, or has there been a discussion on the topic that my searching has not uncovered?

My understanding of ManifestStaticFilesStorage is that it works with the collectstatic command. When you run collectstatic, it only moves the “hashed-name” version of the file to the static directory. If you have your web server configured to serve static files from that collection point, those would be the only files available.

See the three bullet points in the requirements section of the docs for ManifestStaticFilesStorage.

Thank you for your insight!

From the docs for ManifestStaticFilesStorage:

A subclass of the StaticFilesStorage storage backend which stores the file names it handles by appending the MD5 hash of the file’s content to the filename. For example, the file css/styles.css would also be saved as css/styles.55e7cbb9ba48.css .

Note in particular that it says that the file “would also be saved as” the hashed filename. This is consistent with my experience. I deploy to Heroku, so if it were not collected, my static files, such as my favicon that I currently have served via the staticfiles app, would not be served, since the collection location is empty in my repository when Heroku builds my app.
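To make the naming in the quoted docs concrete, here is a minimal standalone sketch of how a hashed name can be derived. It assumes the hash is the first 12 hex characters of the content's MD5 digest, inserted before the extension, which matches the docs' example; the function `hashed_name` is a hypothetical helper for illustration, not Django's own API.

```python
import hashlib
import posixpath


def hashed_name(name: str, content: bytes) -> str:
    """Insert a 12-character MD5 digest of `content` before the extension."""
    # Assumption: a 12-character prefix of the hex digest, as in the
    # docs' example css/styles.55e7cbb9ba48.css.
    digest = hashlib.md5(content).hexdigest()[:12]
    root, ext = posixpath.splitext(name)
    return f"{root}.{digest}{ext}"


# The same path gets a different name whenever the content changes,
# which is why the hashed URL is safe to cache indefinitely.
print(hashed_name("css/styles.css", b"body { margin: 0; }\n"))
print(hashed_name("css/styles.css", b"body { margin: 1px; }\n"))
```

The unhashed path `css/styles.css` has no such property: its contents can change from one deployment to the next while the URL stays the same.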

What you have described is exactly what I want it to do. Your message seems to confirm to me that you may share the intuition that what I’ve suggested is a reasonable desire. Thank you.

I was working on improving static file caching for a site today, and my Nginx config got a bit complex because I don’t want to cache the unprocessed source files with unhashed names:

    # This matches static files that include a hash in their file names.
    # (i.e., those processed by collectstatic). These files should be
    # cached essentially "forever".
    location ~* ^/static/(.*?[a-z0-9_-]+\.[a-f0-9]+\.[a-z]+)$ {
        expires 1y;
        add_header cache-control "public, immutable";
        alias /sites/example.com/current/static/$1;
    }

    # This matches static files that don't include a hash in their file
    # names. These should NOT be cached forever. This allows static
    # source files to be viewed in the browser without needing to know
    # the hashed file name.
    location /static {
        alias /sites/example.com/current/static;
    }

The regex for matching hashed file names seemed somewhat brittle, so I started thinking about other options, and it occurred to me that I don’t really need or want the unprocessed source files to be available in production.
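As a rough illustration of that brittleness, the sketch below runs the same pattern through Python's `re` module (an approximation of Nginx's PCRE matching, with `re.IGNORECASE` standing in for `~*`). The sample paths are hypothetical, and the stricter pattern assumes hashes are exactly 12 hex characters.

```python
import re

# The pattern from the Nginx location block above.
HASHED = re.compile(
    r"^/static/(.*?[a-z0-9_-]+\.[a-f0-9]+\.[a-z]+)$", re.IGNORECASE
)

# A stricter alternative (assumption: hashes are exactly 12 hex chars).
STRICT = re.compile(r"^/static/.+\.[0-9a-f]{12}\.[^/.]+$")


def is_cached_forever(path: str, pattern: re.Pattern = HASHED) -> bool:
    """Return True if `path` would land in the long-lived cache block."""
    return pattern.match(path) is not None


# Hypothetical request paths chosen to probe the edge cases.
samples = [
    "/static/css/styles.55e7cbb9ba48.css",     # hashed
    "/static/css/styles.css",                  # unhashed
    "/static/js/app.de.js",                    # unhashed, but "de" looks like hex
    "/static/fonts/icons.55e7cbb9ba48.woff2",  # hashed, digit in the extension
]

for path in samples:
    print(path, is_cached_forever(path), is_cached_forever(path, STRICT))
```

The original pattern treats `app.de.js` as hashed and misses the `.woff2` file. The stricter pattern handles both, though it is still a heuristic: an unhashed name that happens to contain a 12-character hex segment would match it too.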

I did a bit of searching, and it doesn’t look like ManifestStaticFilesStorage supports this, so I came up with the following (it removes the source files from the STATIC_ROOT destination during post-processing just in case they’re needed before that):

import os

from django.contrib.staticfiles.storage import ManifestStaticFilesStorage as Base


class ManifestStaticFilesStorage(Base):
    """Manifest storage that saves only files with hashed names.

    The purpose of this is to avoid deploying unprocessed static source
    files. Not only are these files typically not needed in production,
    their presence makes static location configuration (in Nginx,
    Apache, etc) more complex, especially with regard to caching.

    When using the base manifest storage, `collectstatic` first copies
    the unprocessed source files to the `STATIC_ROOT` directory (e.g.,
    `./static`). It then post-processes the source files, creating files
    with hashed file names and saves them to the `STATIC_ROOT`
    directory.

    During the post-processing phase, this storage system removes the
    source files that were copied to `STATIC_ROOT` (but keeps source
    files that weren't post-processed).

    """

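    def post_process(self, *args, **kwargs):
        process = super().post_process(*args, **kwargs)
        for name, hashed_name, processed in process:
            yield name, hashed_name, processed
            # `processed` can be an exception instance when a file failed.
            if processed and not isinstance(processed, Exception):
                path = self.path(name)
                # Older Django versions may yield the same file more
                # than once, so the source may already be gone.
                if os.path.exists(path):
                    os.remove(path)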
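For completeness, a storage subclass like this has to be selected in settings before `collectstatic` will use it. The fragment below is a sketch only: `myproject.storage` is a hypothetical module path standing in for wherever the class is saved.

```python
# settings.py (fragment)

# Hypothetical dotted path: adjust to the module that holds the subclass.
STORAGES = {
    "default": {
        "BACKEND": "django.core.files.storage.FileSystemStorage",
    },
    "staticfiles": {
        "BACKEND": "myproject.storage.ManifestStaticFilesStorage",
    },
}

# On Django versions before 4.2, the equivalent single setting is:
# STATICFILES_STORAGE = "myproject.storage.ManifestStaticFilesStorage"
```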

With this in place, I was able to simplify my Nginx config to just:

    location /static {
        expires 1y;
        add_header cache-control "public, immutable";
        alias /sites/example.com/current/static;
    }

If you do any kind of dynamic static file lookup (probably not very common), you’ll have to take into account that the source files aren’t present and do something like this:

from django.conf import settings
from django.contrib.staticfiles.finders import find
from django.contrib.staticfiles.storage import staticfiles_storage


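def get_static_file_path(name):
    """Return the file system path for the static file `name`."""
    if settings.DEBUG:
        # In development, look up the source file via the finders.
        return find(name)
    # In production, only the hashed copy exists in STATIC_ROOT.
    hashed_name = staticfiles_storage.stored_name(name)
    return staticfiles_storage.path(hashed_name)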