David and I (Felix) have two ideas for extending makemigrations.
We are both happy to contribute and issue a Pull Request! But we want to check upfront if our work is worth the hassle and the PRs would have a chance to make it into the codebase.
Both ideas require deep integration into makemigrations.py and cannot be isolated from it.
Idea 1 - Lint Migrations
Please check out this extension maintained by David:
This blog article explains its purpose:
The idea is to integrate the linter into makemigrations itself, so that makemigrations warns if the migration about to be created is not backward compatible.
A new setting could be introduced to enable this warning during makemigrations.
Idea 2 - Replace Migrations
We want to get rid of old migrations. However, squashing migrations is not nice to handle in big projects, because of unsquashable operations, circular dependencies and squashing already squashed migrations.
The idea is to extend makemigrations with a flag --replace-migrations. If active, Django would:
ignore all existing migrations
create new initial migrations
add all existing migrations into the replaces-list of the new initial migrations
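To make the last step concrete, here is a minimal sketch of how the replaces list could be collected. It relies only on the fact that Django's `replaces` attribute is a list of `(app_label, migration_name)` tuples; the helper name `collect_replaces` and the idea of reading names from the migrations directory are assumptions for illustration, not the code from our draft.

```python
from pathlib import Path


def collect_replaces(app_label: str, migrations_dir: Path) -> list[tuple[str, str]]:
    """Build the (app_label, migration_name) pairs for a `replaces` list.

    Hypothetical helper: every migration module found in `migrations_dir`
    (except __init__.py) is treated as replaced by the new initial migration.
    """
    names = sorted(
        path.stem
        for path in migrations_dir.glob("*.py")
        if path.name != "__init__.py"
    )
    return [(app_label, name) for name in names]
```

The returned list would then be written into the new initial migration as its `replaces` attribute.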
Let’s assume that our current version of the software that we distribute is 3.0 and we want to get rid of all migrations prior to 2.0 (we know that all servers are >2.0). The workflow for getting rid of migrations prior to 2.0 would be:
git checkout 2.0
create a new branch 2-0-delete-migrations
delete all existing migrations
commit
git checkout 2.0
create a new branch 2-0-replace-migrations
run ./manage.py makemigrations --replace-all [app1, app2, ...]
commit
checkout master branch
cherry-pick commit from 2-0-delete-migrations
cherry-pick commit from 2-0-replace-migrations
Result:
All migrations prior to 2.0 are gone
New replacement migrations are generated that immediately create the state from 2.0 without the need to run squashmigrations
Server Scenarios:
Database is prior to 2.0 -> Migrations will not work
Database is after 2.0 -> Newly created replacement migrations will not run because all migrations they replace are already applied
Database is fresh -> Newly created replacement migrations will run.
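The three scenarios above can be summarised in a small decision function. This is a simplified model of the behaviour described in this post, not Django's actual loader code; the function name and the return labels are made up for illustration.

```python
def replacement_status(replaces: set[str], applied: set[str]) -> str:
    """Classify what happens to a replacement migration on a given database.

    `replaces` holds the names of the old migrations listed in the
    replacement migration; `applied` holds the names recorded as applied
    in the database.
    """
    already_applied = replaces & applied
    if not already_applied:
        return "run"      # fresh database: the replacement migration runs
    if already_applied == replaces:
        return "skip"     # everything it replaces is applied: nothing to do
    return "broken"       # only partly applied and the old files are gone
```

For example, a database that has applied only some of the replaced migrations (a server prior to 2.0) ends up in the "broken" case.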
I have the code for this ready. It helped us remove ~300 old migrations.
Please let us know whether it’s worth working on a PR.
These both look great, but I’m not quite sure if they would benefit from being merged in - remember, being merged in means you now only get to ship changes once every 8 months, so if you wanted to make a small tweak, you have to wait. Both of these seem like they’d benefit more from active improvement, though replacing migrations, of the two, is the one we could maybe consider merging in.
That said, we could consider either adopting the projects under the Django banner or recommending them in the docs as a half-step; I’ll have to do a more thorough read-through of the code and understand their scope before I know if that’s sensible, though.
Do you have a code link for the migration-replacing workflow?
Here is a draft for makemigrations.py that includes the changes for the second idea. Just search the code for replace_all to see all the non-invasive changes made:
Considering the workflow, I forgot to mention some details in the original post:
The flag only works if all app labels are specified explicitly, as otherwise Django also tries to replace its own migrations and those of dependencies.
After replacing all migrations (with the workflow I described above), one must also check the dependencies in newer migrations and manually replace all occurrences of the removed migrations with the new replacement migrations.
The workflow I described is only necessary if the migrations contain circular dependencies. If they don’t, a much simpler workflow (without using git) is possible, as the old migrations can stay in the code. Changing the dependencies would also not be necessary. The reason lies in how parent/child dependencies are changed during the migration itself. If interested, I can elaborate.
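The manual dependency fix mentioned in the second point amounts to a lookup over each newer migration's `dependencies` list. A rough sketch, assuming a mapping from removed migrations to their replacements has been prepared by hand (the function name and the migration names in the usage are hypothetical):

```python
def rewrite_dependencies(dependencies, replaced_by):
    """Point dependencies on removed migrations at their replacements.

    `dependencies` is a list of (app_label, migration_name) tuples as found
    in a migration file. `replaced_by` maps each removed migration to the
    replacement migration that now stands in for it. Duplicates that arise
    from the rewrite are dropped, and the original order is kept.
    """
    rewritten = []
    for dependency in dependencies:
        new_dependency = replaced_by.get(dependency, dependency)
        if new_dependency not in rewritten:
            rewritten.append(new_dependency)
    return rewritten
```

Dependencies that are not in the mapping, such as ones pointing at migrations created after 2.0, pass through unchanged.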
About the first idea:
Please note that our intention is not to merge any parts of the existing project into Django. The idea is rather that developers are warned when the migration being created would contain non-backward-compatible operations (e.g. altering a field, removing a field, or adding a new non-null field). That would be less complex than linting the migrations afterwards…
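To illustrate the kind of check we have in mind, here is a minimal sketch covering the three example operations named above. `PlannedOperation` is a hypothetical stand-in for the operations makemigrations detects, not a Django class, and the rules are deliberately limited to those three cases.

```python
from dataclasses import dataclass


@dataclass
class PlannedOperation:
    """Hypothetical stand-in for an operation makemigrations is about to write."""
    kind: str          # "AddField", "AlterField", "RemoveField", ...
    model: str
    field: str
    null: bool = True  # only meaningful for AddField


def backward_incompatible_warnings(operations):
    """Return one warning string per operation that may break older code."""
    warnings = []
    for op in operations:
        target = f"{op.model}.{op.field}"
        if op.kind == "RemoveField":
            warnings.append(f"{target}: removing a field")
        elif op.kind == "AlterField":
            warnings.append(f"{target}: altering a field")
        elif op.kind == "AddField" and not op.null:
            warnings.append(f"{target}: adding a non-null field")
    return warnings
```

makemigrations would print these warnings before writing the migration file; nullable additions and other operations pass silently in this sketch.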
For the linting - that’s a great project and would be a real lifesaver in many situations.
For the replacing workflow - is it that different from squashing with elidable=True on all RunSQL / RunPython operations? I’ve normally seen projects accumulate some special SQL in RunSQL operations, for things that Django doesn’t (or didn’t) support, such as stored procedures, triggers, or check constraints (now supported). Encouraging a workflow that removes all such custom SQL without any consideration could be a bit of a footgun, at least leading to dev/prod mismatches. At least with squashing all operations are preserved unless explicitly marked as elidable. I admit squashing has a bunch of problems but I hope with some effort it can be “smoothed out” a bit.
Hello @adamchainz,
thanks for reviewing the ideas and great that you like the linting!
For the replacing workflow - is it that different from squashing with elidable=True on all RunSQL / RunPython operations?
I guess in theory you are right and this is the case. However, in practice, in large projects with a lot of migrations and circular dependencies, squashing becomes very, very hard because of the problems described here.
But you are right. Replacing migrations when RunSQL operations are used to apply structural changes to the database might be dangerous, and users should be warned about it. Data migrations, however, could be skipped if they only convert data to comply with changes made to the database: on fresh databases there is no data yet that needs to be converted. If data migrations are used to seed the database with an initial state (like a fixture), then this would also be an issue.
I think in general that this replace-all workflow can only be used along with documentation, with care, and probably with some manual work. However, at least for us it was a lifesaver, as squashing our >1000 migrations was simply not possible, and with replacing migrations we finally found a way out of accumulating new migrations forever.
@andrewgodwin no, it does not. The newly created migrations are basically fresh initial migrations, as if makemigrations was just called for the first time.
Sorry for breaking the link - I needed to move a file to make the project installable.
Here is the new link:
Ah, that’s problematic then - we’d need to have a series of big warnings saying that if you had either of those kinds of operations in the migration that you’d essentially need to rewrite them into the new migrations before you can continue.
@andrewgodwin yes, as I wrote above, I’d expect that --replace-migrations can only be used along with documentation, with care, and probably with some manual work. I’d be very happy to write that documentation, though, and would take ownership of upcoming issues…