Idea: {make,squash}migrations --no-deps

Hi,

I’ve been thinking about the problem of large projects with many models and migrations, and circular cross-app dependencies. In such projects, squashmigrations typically fails, and in many cases, one cannot even “declare migration bankruptcy” – the practice where you just delete all migrations from the code and the DB and start afresh: you can’t make the initial migration for app A because it depends on (i.e. has a model with an FK to a model in) app B, which depends on app C, which depends on app A.
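
To make the A → B → C → A situation concrete, here is a minimal sketch that finds such a cycle in an app-level dependency map. The map `APP_DEPS` and the function `find_cycle` are hypothetical names invented for this illustration; nothing here is Django API.

```python
# Hypothetical app-level dependency map: app -> apps it has FKs into.
APP_DEPS = {
    "A": {"B"},
    "B": {"C"},
    "C": {"A"},
    "D": {"A"},
}


def find_cycle(deps):
    """Return one dependency cycle as a list of apps, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {app: WHITE for app in deps}
    path = []

    def visit(app):
        state[app] = GREY
        path.append(app)
        for dep in sorted(deps.get(app, ())):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path, so we have closed a loop.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[app] = BLACK
        return None

    for app in sorted(deps):
        if state[app] == WHITE:
            cycle = visit(app)
            if cycle:
                return cycle
    return None


cycle = find_cycle(APP_DEPS)
```

With the map above, the search starts at A and reports the loop through B and C back to A.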

This is a different take on a problem which was tackled in Ideas: New additions to makemigrations – the idea there was a somewhat dangerous operation to replace a whole set of migrations with new initial ones.

I’m suggesting that we add a --no-deps flag to makemigrations and squashmigrations. In makemigrations, this will create the models without the relationships; in squashmigrations, I’m not entirely sure, but in general, it would try to get from the state before the first squashed migration to the state after the last squashed one, again without creating relationships and dependencies. The user will then be able to run makemigrations for each (or all) of the apps, to create the relationships.

This will support the case not handled in the previous suggestion, where the FK dependencies are circular, preventing the creation of the new migrations. squashmigrations can be more careful about the RunSQL and RunPython (and other) operations.

This is just an initial thought; comments (including pointing out why I’m wrong and this can’t work or shouldn’t be supported) welcome.

Actually, this works in the cases I have tried. Migrations picks one model and creates it without the FK field, creates the other models in turn, then adds the FK field to the original model.
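
The cycle-breaking described here can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the actual autodetector logic: operations are plain tuples, `plan_operations` is a hypothetical helper, and models are handled in declaration order.

```python
def plan_operations(models):
    """Plan creation of models whose FKs may form a cycle.

    models: {model_name: [(field_name, target_model), ...]} in declaration order.
    FKs whose target already exists go into CreateModel; the rest are
    deferred to AddField operations emitted after all models exist.
    """
    created = set()
    create_ops, deferred_ops = [], []
    for name, fks in models.items():
        inline = []
        for field, target in fks:
            if target in created or target == name:
                inline.append(field)
            else:
                deferred_ops.append(("AddField", name, field, target))
        create_ops.append(("CreateModel", name, tuple(inline)))
        created.add(name)
    return create_ops + deferred_ops


# Two hypothetical models that point at each other.
plan = plan_operations({
    "Author": [("favourite_book", "Book")],
    "Book": [("author", "Author")],
})
```

Here Author is created without its FK, Book is created with its FK to Author, and the FK from Author to Book is added last.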

This had not been my experience, but the last time I tried was a few years ago. So I went back to a project where I ran into this, which has a couple dozen apps and a few thousand migrations, and tried again, and indeed, “migration bankruptcy” works much better than I remembered.

But squashmigrations only works on one app at a time, so telling it to ignore external dependencies while it does that may still be helpful IMO.

The big issue I see with “migration bankruptcy” is that it doesn’t include custom SQL or other operations from the deleted migrations files. That might mean failing to apply some specific changes not yet supported by Django, like custom data types.

If we could teach squashmigrations to work on multiple apps, it could convert circular relationships into multiple migrations in one app. Then maybe we wouldn’t need --no-deps?

The current implementation of squashmigrations is:

  • Find a list of all the migrations which should be squashed
  • Collect operations from all of them, essentially, into one big migration
  • Unless told otherwise, optimize this migration
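
The three steps above can be sketched roughly as follows. This is a toy model, not Django's code: migrations are plain dicts, operations are tuples, and `optimize_ops` is a hypothetical optimizer with a single rule (fold an AddField into an earlier CreateModel of the same model).

```python
def optimize_ops(ops):
    """Toy optimizer: fold ("AddField", model, field) into an earlier
    ("CreateModel", model, fields) for the same model."""
    result = []
    for op in ops:
        if op[0] == "AddField":
            for i, prev in enumerate(result):
                if prev[0] == "CreateModel" and prev[1] == op[1]:
                    result[i] = ("CreateModel", prev[1], prev[2] + (op[2],))
                    break
            else:
                # No matching CreateModel found; keep the AddField as-is.
                result.append(op)
        else:
            result.append(op)
    return result


def squash(app_label, migrations, optimize=True):
    """Collect operations from all listed migrations into one big migration."""
    ops = [op for m in migrations for op in m["operations"]]
    if optimize:
        ops = optimize_ops(ops)
    return {
        "replaces": [(app_label, m["name"]) for m in migrations],
        "operations": ops,
    }


# Hypothetical input: two migrations of an app called "library".
to_squash = [
    {"name": "0001_initial", "operations": [("CreateModel", "Book", ("title",))]},
    {"name": "0002_book_isbn", "operations": [("AddField", "Book", "isbn")]},
]
squashed = squash("library", to_squash)
unoptimized = squash("library", to_squash, optimize=False)
```

Note that nothing in this flow looks at dependencies across apps, which is the gap discussed below.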

What you’re suggesting is definitely better than --no-deps, but requires a complete rewrite. The current code isn’t even graceful about dependencies – as far as I can tell, it will gladly include more than one dependency on the same app, and has no capacity to split the squashed migration into more than one (and it is possible for it to “know” that it needs to – e.g. if migration 0002 has

run_before=[('other_app', '0017')]

and migration 0003 has

dependencies=[
    ('our_app', '0002'),
    ('other_app', '0017'),
]

then there will need to be a break between migrations up to 0002 and migrations starting at 0003) – but squashmigrations doesn’t even look at that.
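
As a rough sketch of how such a break could be detected, assuming migrations are represented as plain dicts (the function `find_breaks` and the data layout are hypothetical, and only direct dependencies are checked, not transitive ones):

```python
def find_breaks(migrations):
    """Split an ordered list of one app's migrations into squashable groups.

    A new group starts whenever a migration depends on something that an
    earlier migration in the current group declared in run_before: a single
    squashed migration cannot both run before X and depend on X.
    """
    groups, current, pending = [], [], set()
    for m in migrations:
        if current and pending & set(m["dependencies"]):
            groups.append(current)
            current, pending = [], set()
        current.append(m["name"])
        pending |= set(m["run_before"])
    if current:
        groups.append(current)
    return groups


# The situation from the post, with two extra migrations around it.
our_app = [
    {"name": "0001", "run_before": [], "dependencies": []},
    {"name": "0002", "run_before": [("other_app", "0017")], "dependencies": []},
    {"name": "0003", "run_before": [], "dependencies": [("other_app", "0017")]},
    {"name": "0004", "run_before": [], "dependencies": []},
]
groups = find_breaks(our_app)
```

For this input the break falls between 0002 and 0003, giving two groups that could each be squashed separately.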

--no-deps, on the other hand, seems like it can be implemented within reasonable effort (ignore all deps except those of the first migration, ignore all operations to create or modify relation fields), and still be useful – and relatively safe, as it avoids the problems you mention (correctly) with bankruptcy.
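
A minimal sketch of what that could look like, assuming a toy representation where migrations and operations are dicts. The function `squash_no_deps`, the `kind` and `field_type` keys, and the sample data are all hypothetical; the sketch also does not handle RemoveField or RenameField on relation fields, which a real implementation would need to track.

```python
RELATION_FIELD_TYPES = {"ForeignKey", "OneToOneField", "ManyToManyField"}


def squash_no_deps(migrations):
    """Squash while keeping only the first migration's dependencies and
    dropping every operation that creates or modifies a relation field."""
    ops = []
    for m in migrations:
        for op in m["operations"]:
            if op["kind"] == "CreateModel":
                # Keep the model, strip its relation fields.
                fields = {
                    name: ftype
                    for name, ftype in op["fields"].items()
                    if ftype not in RELATION_FIELD_TYPES
                }
                ops.append({**op, "fields": fields})
            elif (
                op["kind"] in {"AddField", "AlterField"}
                and op["field_type"] in RELATION_FIELD_TYPES
            ):
                continue  # relation fields are left for a later makemigrations run
            else:
                ops.append(op)
    return {
        "dependencies": list(migrations[0]["dependencies"]),
        "operations": ops,
    }


sample = [
    {
        "dependencies": [("accounts", "0001_initial")],
        "operations": [
            {"kind": "CreateModel", "name": "Book",
             "fields": {"title": "CharField", "owner": "ForeignKey"}},
        ],
    },
    {
        "dependencies": [("shop", "0004_order")],
        "operations": [
            {"kind": "AddField", "model": "Book", "name": "order",
             "field_type": "ForeignKey"},
            {"kind": "AddField", "model": "Book", "name": "isbn",
             "field_type": "CharField"},
        ],
    },
]
result = squash_no_deps(sample)
```

Running makemigrations afterwards would then be expected to pick up the missing relation fields, as proposed in the original post.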

I had previously done something similar (if I understand the original request correctly): I wrote two scripts for squashing “safe” migrations (no dependencies on other apps, no data migrations via RunSQL or RunPython). Is this something you’re looking for?

Squashing: squash_migrations.py · GitHub
Then removing squashed migrations after they’ve been applied: remove_squashed_migrations.py · GitHub

If it is something along the lines of what you were thinking, would it be something that could be contributed to Django? We thought of this as a quick win, as it could often reduce the number of migrations significantly, but it’s not perfect.

Sorry for the delayed reply – no, your squash_migrations.py is not what I’m looking for; IIUC, it squashes when it knows things are fine, otherwise leaves things for later. What I want is to ignore dependencies – do the squash even when dependencies should, technically, get in the way, with the assumption that the developer doing this knows what they’re doing and will make sure to fix the issues.

https://code.djangoproject.com/ticket/35508

Hey,

Thanks for the idea!

with the assumption that the developer doing this knows what they’re doing and will make sure to fix the issues.

This is a part that bothers me. This option wouldn’t just work for users, which is the preferred way of doing things in Django IMO, but rather requires a more advanced understanding of migrations.

I rather concur with @adamchainz, that making squashmigrations able to generate more than one file in order to resolve circular dependencies would be a sweet feature addition.
Of course, it’s more effort - but perhaps it’s more worth it compared to adding an option that is meant to be deleted :)

Migrations, in general, were never like that – makemigrations has always worked for the simple cases, but required editing when things get hairier. This is in the same department.

For us to say that the option is meant to be deleted, IMO, we need to have at least a clear plan for adding the replacement. As far as I can tell, the algorithms are not known yet, and even the requirements are not well-defined.

Further, I have a hunch that in some cases, --no-deps can solve problems that merely splitting migrations to break cycles will not; I think one such example is an app being removed, where operations between the addition of an FK to it and the removal of that FK prevent these two operations from being reduced to nothing.

Interesting topic! I talked to some folks at DjangoCon in Vigo about my django-migration-zero package/approach, which is basically the idea of “migration bankruptcy” (nice term!), because squashing is super hard AND has, in my opinion and that of a bunch of other people, zero benefit if you control all the databases (which is mostly the case for applications, as opposed to a package, for example).

So, bottom line: A big +1 for having a migration clean-up command that “just works” in the case that you don’t need a history…