Idea to explore: Partial migration operations

Inspired by this DjangoCon US 2024 talk and some personal struggles with zero-downtime migrations, I’d like to explore the idea of “partial migration operations”. I will use deleting a model field as the example, though there may be more applications. In any case, this post is just a playground to explore the idea.

In order to delete a model field with zero downtime, one must do a bit more than what python manage.py makemigrations does when removing the field from the model code: make the field nullable, remove the field from the state and finally remove the field from the database.
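To make the ordering concrete, here is a minimal sketch of the three steps using only the standard-library sqlite3 module instead of Django. The book table, its columns, and the helper names are assumptions made up for illustration. SQLite cannot relax a NOT NULL constraint in place, so the sketch rebuilds the table; other databases would use a direct ALTER statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book ("
    "id INTEGER PRIMARY KEY, title TEXT NOT NULL, subtitle TEXT NOT NULL)"
)


def insert_from_new_code(title):
    # Code deployed after the field was removed from the model no longer
    # mentions `subtitle` in its INSERT statements.
    conn.execute("INSERT INTO book (title) VALUES (?)", (title,))


def rebuild_book(column_defs, column_names):
    # Hypothetical helper: recreate the table with a new definition and
    # copy over the listed columns.
    conn.executescript(
        f"CREATE TABLE book_new ({column_defs});"
        f"INSERT INTO book_new ({column_names}) SELECT {column_names} FROM book;"
        "DROP TABLE book;"
        "ALTER TABLE book_new RENAME TO book;"
    )


# Without step 1, the new code breaks as soon as it writes a row.
try:
    insert_from_new_code("Too early")
    new_code_failed_before_step_1 = False
except sqlite3.IntegrityError:
    new_code_failed_before_step_1 = True

# Step 1 (before the deploy): make the column nullable.
rebuild_book(
    "id INTEGER PRIMARY KEY, title TEXT NOT NULL, subtitle TEXT",
    "id, title, subtitle",
)

# Step 2: remove the field from the state. Nothing changes in the database;
# the new code is deployed and can now write rows safely.
insert_from_new_code("After deploy")

# Step 3 (after the deploy): drop the column for real.
rebuild_book("id INTEGER PRIMARY KEY, title TEXT NOT NULL", "id, title")

columns = [row[1] for row in conn.execute("PRAGMA table_info(book)")]
titles = [row[0] for row in conn.execute("SELECT title FROM book")]
```

In a Django project, step 2 is usually written with migrations.SeparateDatabaseAndState, so that the field leaves the migration state while the column stays in the table.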

So here’s the idea. Imagine one could delete the field from the model and then run: python manage.py makemigrations --partial. This would generate a new RemoveFieldOperationPart1 (please forgive the name) operation in a migration. This migration would perform the batch of operations that are safe to run in a deployment (e.g. make the field nullable and remove it from the state). After a deploy we could run python manage.py makemigrations --partial again. The field no longer exists in the model, but the state knows there is a RemoveFieldOperationPart1, so the command would generate a RemoveFieldOperationPart2Final, which would finally remove the column from the table.

The --partial argument would be necessary so that part 2 is not generated until explicitly requested (i.e. after a deployment). I could also imagine that, if there are pending partials from multiple places, the makemigrations command could interactively ask which follow-up partials to include.
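To make the flow easier to discuss, here is a rough, framework-free sketch of how the detection could behave. Nothing in it is existing Django API: Op, State, detect, and apply_ops are hypothetical names, and the state is reduced to a set of fields plus a set of pending partials.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Op:
    kind: str
    model: str
    name: str


@dataclass
class State:
    fields: set  # (model, field) pairs known to the migration state
    pending: set = field(default_factory=set)  # part 1 done, part 2 missing


def detect(code_fields, state, partial=False):
    # Fields still in the state but gone from the model code.
    removed = sorted(state.fields - code_fields)
    if not partial:
        # Plain makemigrations: the usual one-shot removal, and pending
        # partials are left alone until --partial is passed.
        return [Op("RemoveField", m, n) for m, n in removed]
    ops = [Op("RemoveFieldOperationPart2Final", m, n) for m, n in sorted(state.pending)]
    ops += [Op("RemoveFieldOperationPart1", m, n) for m, n in removed]
    return ops


def apply_ops(state, ops):
    # Record the effect of the generated operations on the state.
    for op in ops:
        key = (op.model, op.name)
        if op.kind == "RemoveFieldOperationPart1":
            state.fields.discard(key)
            state.pending.add(key)
        elif op.kind == "RemoveFieldOperationPart2Final":
            state.pending.discard(key)
        else:
            state.fields.discard(key)


code_fields = {("book", "title")}
state = State(fields={("book", "title"), ("book", "subtitle")})

first = detect(code_fields, state, partial=True)   # before the deploy
apply_ops(state, first)
plain = detect(code_fields, state)                 # no --partial: nothing to do
second = detect(code_fields, state, partial=True)  # after the deploy
apply_ops(state, second)
```

The interactive choice between several pending partials is left out; in this sketch every pending part 2 is emitted as soon as --partial is passed.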

What do you think?

Oh well, this may be a duplicate of Let's talk zero-downtime migrations. In any case, I know the topic has been discussed at length, and different approaches to zero-downtime migrations have been explored and implemented. I'm not sure whether this specific idea has been explored, though. If it rings a bell and leads to a dead end, please let me know.

Hello @lorinkoz, I do think this is a duplicate of the thread you’ve linked.

I believe the minimal changes that would be needed in core to achieve what you’re describing are demonstrated in this package. The primitives needed to make it work are:

  1. Add the notion of a stage for migrations, so the framework can create distinct operations for each one (e.g. pre- and post-deployment).
  2. Adjust the auto-detector (aka makemigrations) to take advantage of the notion of stage by producing migrations that partition operations by it.
  3. Adjust the migration executor (aka migrate) to be able to run in either stage (pre- or post-deployment).
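A toy illustration of these three primitives in plain Python, so the shape is easier to picture. This is not the package's API and not Django's: Stage, Operation, partition_by_stage, and migrate are hypothetical names, and operations are reduced to text descriptions.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    # Primitive 1: every operation belongs to a stage.
    PRE = "pre_deployment"
    POST = "post_deployment"


@dataclass(frozen=True)
class Operation:
    description: str
    stage: Stage


def partition_by_stage(operations):
    # Primitive 2: the auto-detector emits one migration per stage,
    # keeping the original order of operations within each stage.
    migrations = []
    for stage in Stage:
        ops = [op for op in operations if op.stage is stage]
        if ops:
            migrations.append((stage, ops))
    return migrations


def migrate(migrations, stage):
    # Primitive 3: the executor only runs migrations of the requested stage.
    return [op.description for s, ops in migrations if s is stage for op in ops]


detected = [
    Operation("drop column book.subtitle", Stage.POST),
    Operation("make book.subtitle nullable", Stage.PRE),
    Operation("remove book.subtitle from state", Stage.PRE),
]
migrations = partition_by_stage(detected)
pre_run = migrate(migrations, Stage.PRE)    # run before deploying the new code
post_run = migrate(migrations, Stage.POST)  # run once the deploy is done
```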

I think it would be better to continue the discussion over the other thread though to avoid fragmenting the discussion.

Alright, thank you! I will cross reference and follow up on the other one!