Very slow migrations with large numbers of tables

Without getting into all the details, model classes (ModelBase instances, AKA Model subclasses) cache forward and backward relationships in their _meta object (look at all the properties in there) in order to make introspection and ORM query resolution faster. This is a cost worth paying at project setup time, as it happens once and then remains immutable for the remainder of the Python process, but it performs very poorly when model classes have to be generated over and over again as they are mutated between operations.

There have been years of investment in trying to build smarter invalidation logic for dynamically built fake model classes, to avoid doing unnecessary work while ensuring that the graph doesn’t become corrupt (e.g. a reverse relationship still pointing at an old model reference). This problem perfectly embodies why cache invalidation is said to be hard: it’s not only about making things faster but also about making sure the resulting changes do not corrupt relationships in non-trivial ways.
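To make the stale-reference problem concrete, here is a toy sketch in plain Python. It is not Django's implementation: `render`, `registry` and `reverse_cache` are hypothetical names, and the "models" are bare classes. It only shows how regenerating one class can leave related classes and cached reverse relationships pointing at the old one.

```python
# Toy illustration only: this is NOT how Django stores relationships.
# `render`, `registry` and `reverse_cache` are hypothetical names.

def render(name, related_to=None):
    """Build a brand-new class from a lean description, like a re-render."""
    return type(name, (), {"related_to": related_to})

registry = {}
registry["Author"] = render("Author")
registry["Book"] = render("Book", related_to=registry["Author"])

# Cache "which models point at me", computed once up front.
reverse_cache = {registry["Author"]: [registry["Book"]]}

# An operation mutates Author, so its class has to be generated again.
old_author = registry["Author"]
registry["Author"] = render("Author")

# Without invalidation, Book still references the old class, and the
# cache is keyed on a class that is no longer the registered one.
stale_forward = registry["Book"].related_to is old_author
stale_reverse = registry["Author"] not in reverse_cache

# The safe (but expensive) fix: re-render everything related to Author.
registry["Book"] = render("Book", related_to=registry["Author"])
reverse_cache = {registry["Author"]: [registry["Book"]]}
repaired = registry["Book"].related_to is registry["Author"]
```

In this toy, the only always-correct repair is to rebuild every class connected to the one that changed, which is the repeated work that makes large graphs slow. Anything smarter has to prove it never leaves a stale reference behind.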

Both Markus Holtermann and I spent a lot of time fiddling in this area and came to the conclusion that the likely best way forward is to adapt the schema editor to operate from ModelState instances (which are lean tuples of field options) instead of ModelBase instances that require constant rendering. That would require a long deprecation phase and careful planning to avoid leaving third-party backends and other migration-related third-party apps in the dust.

Assuming we were able to achieve this goal, ModelBase instances would only need to be rendered for RunPython operations, which could be done on demand when retrieving models from apps.get_model.
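As a rough sketch of what on-demand rendering could look like, here is a toy registry in plain Python. `LeanState` and `LazyApps` are hypothetical stand-ins, not Django classes, and relationships are ignored entirely. Schema-style changes only swap lean state, and a class is built the first time `get_model` asks for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeanState:
    """Hypothetical stand-in for a ModelState: a name plus field options."""
    name: str
    fields: tuple = ()  # pairs of (field_name, options)


class LazyApps:
    """Hypothetical registry that renders classes only in get_model()."""

    def __init__(self, states):
        self._states = {state.name: state for state in states}
        self._rendered = {}
        self.render_count = 0

    def alter(self, name, fields):
        # Schema operations only replace lean state; nothing is rendered.
        self._states[name] = LeanState(name, tuple(fields))
        self._rendered.pop(name, None)  # drop any class built from old state

    def get_model(self, name):
        # Render on first access, then reuse until the state changes.
        if name not in self._rendered:
            state = self._states[name]
            self._rendered[name] = type(state.name, (), dict(state.fields))
            self.render_count += 1
        return self._rendered[name]


apps = LazyApps([LeanState("Book", (("title", "CharField"),))])
apps.alter("Book", [("title", "CharField"), ("pages", "IntegerField")])
apps.alter("Book", [("title", "TextField"), ("pages", "IntegerField")])
renders_before_access = apps.render_count

Book = apps.get_model("Book")  # the only point where a class gets built
```

Two alterations cost no renders here; the single render happens only when something (a RunPython function, say) actually asks for the class.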

I’m not saying this to discourage you from trying to make things faster; I know @shangxiao ventured into this area not too long ago following another “migrations are slow” thread. I just want to warn you that a lot of cycles have already been spent in this area, and they ended up as promising speedups followed by regressions in complex migration graphs for large projects in the wild, which required reverts. Over the years this has reinforced my belief that moving away from creating disposable Python classes for the sole purpose of introspection is the saner approach.
