Generic backup/restore command.

I’m wondering what solutions other folks use for backup and restoration of their Django sites?

Seems like there are two main paths to go down here:

  1. Use the functionality of your infrastructure - i.e. above Django on the stack.
  2. Use Django aware tooling - most likely a management command or series of management commands.

Option 1 seems most appropriate for deployments on complicated infrastructure. I don’t manage anything especially complex, so I’m more interested in option 2. I also like to keep my sites as infrastructure-agnostic as possible so I don’t get locked into any particular vendor.

I’ve been looking around the community and haven’t found an existing solution that satisfies all my needs, so I’m considering rolling a new library. My needs are:

  1. Backup/restore from any Django storage implementation.
  2. Optionally encrypt backups w/ overridable/pluggable encryption strategies.
  3. Back up all website/Python state: databases (all flavors, using native dumps), Python dependency versions, media files, and all other custom state. To satisfy the custom state requirement the command will need to have a plugin system.
  4. Restore from specific timepoints.
  5. Configurable backup retention strategy (optional, seems appropriate this could be handled external to the tool)
  6. Warnings and blocks configurable in settings on restore to avoid accidentally wiping a production system.
  7. Warnings when restoring from a snapshot made on a different Python stack.
  8. A single command should back up everything, but certain artifacts can be elided from the backup based on CLI parameters (e.g. ./manage.py backup or ./manage.py backup database media).
  9. Incremental backups - diffs to save file space. Not a dealbreaker but would be awesome.
  10. Optional atomicity - i.e. lock the site during backup so disk state and db state are consistent. (Not a dealbreaker but nice to have.)
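
To make needs 3 and 8 a bit more concrete, here is a rough sketch of what the plugin side could look like. Everything in it is hypothetical (the `register_artifact` decorator, `select_artifacts`, `run_backup`, and the placeholder plugins are names made up for illustration, not an existing library), and the plugin bodies only return the paths they would write rather than doing any real work.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical registry: artifact name -> function that backs up that artifact
# into a target directory and returns the paths it produced.
_ARTIFACTS: Dict[str, Callable[[str], List[str]]] = {}


def register_artifact(name: str):
    """Decorator that registers a backup routine under a CLI-visible name."""
    def decorator(func):
        if name in _ARTIFACTS:
            raise ValueError(f"artifact {name!r} is already registered")
        _ARTIFACTS[name] = func
        return func
    return decorator


def select_artifacts(requested: Optional[Iterable[str]] = None) -> List[str]:
    """Return the artifact names to back up; no names means everything."""
    requested = list(requested or [])
    if not requested:
        return list(_ARTIFACTS)  # registration order
    unknown = [n for n in requested if n not in _ARTIFACTS]
    if unknown:
        raise KeyError(f"unknown artifacts: {', '.join(unknown)}")
    return requested


def run_backup(target_dir: str, requested=None) -> Dict[str, List[str]]:
    """Run each selected plugin and collect the paths it reports."""
    return {name: _ARTIFACTS[name](target_dir) for name in select_artifacts(requested)}


@register_artifact("database")
def backup_database(target_dir: str) -> List[str]:
    # Placeholder: a real plugin would call the database's native dump tool.
    return [f"{target_dir}/db.dump"]


@register_artifact("media")
def backup_media(target_dir: str) -> List[str]:
    # Placeholder: a real plugin would archive the media files.
    return [f"{target_dir}/media.tar"]
```

With this shape, `./manage.py backup` would map to `run_backup(target)` and `./manage.py backup database media` to `run_backup(target, ["database", "media"])`, and a project could register its own custom state with one more decorated function.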

Aside from platform-agnostic backups to avoid vendor lock-in, my primary use case is to be able to pull production state into a local development environment easily.

These are the existing tools I’ve found that still seem to be alive:

  • django-dbbackup - this is the closest tool to what I’m looking for, but there’s no plugin system for custom state and each state artifact is treated separately.
  • django-pgclone - just for postgres databases
  • django-pg-copy - just for postgres databases

Am I missing anything?

It’d be nice if there were a generic CLI scaffold that custom backup/restore logic could be attached to, so we didn’t each have to reinvent the storage/encryption/snapshot bits. Do others agree?
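
As one example of a "snapshot bit" such a scaffold could own, here is a minimal sketch of restoring from a specific timepoint: pick the newest snapshot taken at or before the requested time. The `backup-<UTC timestamp>` naming scheme and the function names are assumptions made up for this sketch. In a Django setting the list of names might come from something like the directory listing of a storage backend (`Storage.listdir()` returns directories and files), but that wiring is left out here.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional

# Assumed (hypothetical) naming scheme: "backup-YYYYmmddTHHMMSSZ", always UTC.
PREFIX = "backup-"
STAMP = "%Y%m%dT%H%M%SZ"


def snapshot_name(when: datetime) -> str:
    """Build a snapshot name from a timezone-aware datetime."""
    return PREFIX + when.astimezone(timezone.utc).strftime(STAMP)


def parse_snapshot(name: str) -> Optional[datetime]:
    """Return the UTC time encoded in a snapshot name, or None if it does not match."""
    if not name.startswith(PREFIX):
        return None
    try:
        taken = datetime.strptime(name[len(PREFIX):], STAMP)
    except ValueError:
        return None
    return taken.replace(tzinfo=timezone.utc)


def pick_snapshot(names: Iterable[str], at: Optional[datetime] = None) -> Optional[str]:
    """Newest snapshot taken at or before `at`; the newest overall if `at` is None."""
    candidates = []
    for name in names:
        taken = parse_snapshot(name)
        if taken is not None and (at is None or taken <= at):
            candidates.append((taken, name))
    return max(candidates)[1] if candidates else None
```

Because the timestamp is in the name, the same selection logic works for any storage that can list names, and a retention policy could reuse `parse_snapshot` to decide what to delete.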

It’s at least three separate “layers”:

  • Base operating system
  • “Django infrastructure”
  • Data

Personally, I use a mix of Ubuntu autoinstall and Ansible for creating reproducible builds of the operating system.

The Django project itself lives out on a git repo - any specific release or version can be pulled from it.

The database is backed up using either pg_dump or pg_dumpall.
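
For anyone who wants to script that step from Python, a minimal sketch might look like the following. The helper names, database name, and output path are placeholders invented for the example; the pg_dump options used (`--format=custom`, `--file`, `--dbname`) are standard ones, and connection details are assumed to come from the usual libpq environment variables or a `.pgpass` file.

```python
import subprocess
from typing import List


def pg_dump_command(dbname: str, outfile: str) -> List[str]:
    """Build the argument list for a custom-format dump of one database."""
    return [
        "pg_dump",
        "--format=custom",   # compressed archive, restorable with pg_restore
        f"--file={outfile}",
        f"--dbname={dbname}",
    ]


def dump_database(dbname: str, outfile: str) -> None:
    """Run pg_dump and raise CalledProcessError if it exits non-zero."""
    subprocess.run(pg_dump_command(dbname, outfile), check=True)
```

Calling `dump_database("mysite", "/backups/mysite.dump")` would then produce a single archive file. Note that pg_dump covers one database only; cluster-wide objects such as roles need pg_dumpall (for example with `--globals-only`).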

I’ve never looked for anything resembling an “integrated” or “all-in-one” solution, because our needs for each of these three layers are different.

  • The autoinstall process is used to create servers for multiple different purposes. It’s not tied to just creating servers for Django projects.
  • Using the git repo for deployment is done for multiple projects, not all are Django.
  • I use PostgreSQL for more than Django - sometimes the data is related, and sometimes not. (Also, I frequently have entities within the database that are outside the realm of Django’s management - triggers, stored procedures, users, etc)

Having the flexibility to mix and match makes some tasks significantly easier, such as copying production data to a test system to validate upgrades, or testing a Django project on a new version of the operating system (such as Ubuntu 22.04 → 24.04).
