[GSOC 2023] Allow moving a model between apps.

Hi all!
My proposal for GSoC’23 has been accepted! Thankyou @shaib for providing your valuable feedback. I will be working on the issue #24686 (Support for Moving a model between two Django apps) – Django for the next three months.

I’m creating this topic to update everyone with my progress and get regular feedbacks for my work from the Django community.

Discussion: PR
Proposal: "Allow moving a model between apps" -- proposal for Google Summer of Code 2023. · GitHub

Really excited to implement this long awaited feature! :rocket:

Thanks,
Bhuvnesh (aka devilsautumn)

4 Likes

Good luck @DevilsAutumn !

1 Like

Hi all,
I have a created a WIP PR with modified migration operations + some related tests. :point_down:

Would be happy to get the feedback from the community and thoughts about the approach I have taken.

In the meantime, I also wrote a blog on my opensource journey with Django software foundation and some tips to prepare for google summer of code! :rocket:

Thanks,
Bhuvnesh

1 Like

We shortly discussed out current approach with @charettes and we agreed that it’s quite “heavy”. In a real production environment renaming database tables, constraints, indexes etc. will cause huge lock and long downtime. This is unacceptable for most apps. There are also some tricky scenarios that can be hard to handle gracefully. We’re thinking of an alternative minimal approach without renaming the database objects:

  1. Move a model (from app A) to a different app (B) and change all related columns to a model in app B
  2. Run makemigrations
  3. Auto-detector will ask user to confirm that the model name has been changed:
  • If user confirms and db_table is not set in a new model (app B), then the questioner will display a request to set it with the previous name, e.g. app_a_some_table and exit.

  • If user confirms and db_table is set, generate appropriate operations (all database no-ops):

    • delete model in app A (only from state)
    • create model in app B (only in state)
    • rename relations (only in state)

This should allow us to do this incrementally and perhaps improve in the future. We’re aware that it can cause conflicts when folks try to re-add a model with the same name in an old app. For now, we can document that underlying database columns/tables/constraints are not renamed, so creating a model with the “old” name in the source app will cause conflicts. What do you think?

Best,
Mariusz

I think the locks from such renaming should be compared to other operations, like changing the data type of a column (which is supported). In both cases, a heavy lock is required, and for the case where that is an issue, strategies of mitigation are available (e.g. first do copying on the database, then do lighter model-management with DB no-ops).

In order for the strategy you’re suggesting to work, if I’m not mistaken, a change must be applied to the model definition itself – to add the db_table in the model’s Meta. I suggest that we tell users to do this in order to avoid locks related to renaming, if they need to – and make sure we support this, but not try to do it unless explicitly asked to.

The question boils down to, what should be the default behavior. I think the default should be adapted to the small-to-medium case, and to try to keep the history of the model out of its definition unless there is a real need for it.

I may be missing some finer details – I don’t see them in the PR discussion, but if you have a pointer I’d be happy to take a look.

2 Likes

I may be missing some finer details – I don’t see them in the PR discussion, but if you have a pointer I’d be happy to take a look.

Sorry about that, it was a late discussion with Mariusz through Discord DMs where we discussed a somewhat related ticket.

I suggest that we tell users to do this in order to avoid locks related to renaming, if they need to – and make sure we support this, but not try to do it unless explicitly asked to.

I agree. As long as the auto-detection logic is smart enough to not issue renames if the moved definition of the model includes a Meta.db_table referencing the original name I think we should be fine. That’s the approach we’ve taken when a user renames a model while setting its db_table to the one generated with its previous model name or does the equivalent with fields.

As for the larger problem of model renames that don’t currently rename associated objects such as constraints and indexes that’s a larger problem that I wouldn’t scope to this project. It’s a problem that exist for RenameModel operations that don’t involve moving model between apps and is already scoped to a ticket. I would avoid trying to solve this complex problem during this project and focus on the model moving aspect instead.

I believe we’re very close to a point where we don’t need to introduce new operations at all to support model moving. By leveraging model management toggling we already have all the building blocks available to us to make the auto-detector generate a sequence of operations that allows for model moving. Making sure that it supports doing so without causing any database changes (avoid table renames, constraints re-creation) is sugar on top that could be handled in a follow up MR.

1 Like

I gave it a thought and the solution seems feasible - squash migrations, applying on empty and existing db all should work fine. But i’m unable to absorb the idea of not renaming the table. :sweat_smile:
I mean that the true essence of “moving” a model is being influenced. Actually moving a model should rename the table too. I understand there are some complications with renaming of a model due to objects associated with it ,but in first thought of moving model what comes in my mind is renaming of tables. I’m OK with both approaches since you all are much more experienced than me.

I also had a look at a tricky scenario mentioned by @felixxm and i think that moving a model should be treated as a one compact operations and there should not be any interdependency involved(in some case it might work), since there is a lot of complexity/dependencies involved - in both approaches.

I was curious about one thing - if the problem of renaming associated objects (ticket #23577) is solved , would the current approach(#16905) be feasible or better than setting db_table in model’s Meta?

I am real production projects (except the beginning of their life) renaming tables with data is extremely rare as it causes a lot of complication and it’s not worth doing. Moving models between apps is a different thing, apps can become obsolete, or some models can become used only by a different app, etc. In these cases. we’d like to move models between apps, but we don’t really care about the underlying database table.

It’s not solved, this ticket has been open for 9 years, without any PR proposal.

I gave it a thought and the solution seems feasible - squash migrations, applying on empty and existing db all should work fine.

yeah in essence it’s very similar to what you were planing to do with CreateModelInAppState and DeleteModelInAppState but it reuses existing primitives. I believe it also makes it more explicit to support the case where already unmanaged models are moved between apps.

In real production projects (except the beginning of their life) renaming tables with data is extremely rare as it causes a lot of complication and it’s not worth doing. Moving models between apps is a different thing, apps can become obsolete, or some models can become used only by a different app, etc. In these cases. we’d like to move models between apps, but we don’t really care about the underlying database table.

I agree but I also think that it’s okay if the outcome of this project doesn’t immediately address this issue as long as allows for it to be implemented down the road.

What I mean by is that is that I believe we should avoid introducing new operations and specialized x-app dependency rules for this problem if the existing primitives already allow us to solve this problem as it’d be easy to adapt the auto-detector not to issue a RenameModel in this case. By sticking to existing primitives (rename, create, managed toggling) and migration dependency definition features we avoid complexifing operation optimization and migration planning logic which would undoubtedly be the source of bugs.

It’s not solved, this ticket has been open for 9 years, without any PR proposal.

It’s quite hard to solve at this point given the hashing has changed a few times since the migration framework was introduced and some databases still have the constraint and index name inherited from migrating away from South. If we had been consistent it would have been as simple as issuing a few renames, at this point it requires quite a bit of introspection to figure out. It would be nice to solve though as consistent naming would allow commands such as sqlmigrate to be stateless in the sense that they wouldn’t have to reach out to the database to produce the exact SQL they would emit which has also been a requested feature.

1 Like

Great then ,I am starting to code the new approach. :+1:

Hi, the work is almost complete, but i’m stuck at one thing - how should i make the CreateModel ,DeleteModel and AlterField operations no-ops?

By creating the model un-managed and marking the model un-managed before deleting it. As for making AlterField noops that would require changes at the schema editor level.

2 Likes

Wrote a blog sharing my experience till mid-term evaluation. :tada:

1 Like

I was testing this patch for different use cases and I came across a scenario with potential error:

Consider 3 models ,A, B and C. Model C has fk to A and B, and B has m2m field to A through C. We move model B to app_two first and then move model C to app_two. For moving model C we have to change the through of model B to test_two.C and hence it should have an AlterField operation which is missing.

If we only move Model C to app_two and not B , AlterField is added for the m2m field as through is changed.
Will update the PR as soon as it is solved.

Hi all, I have been testing my patch #16905 which allow moving a model between apps for the last few days. I have tried the following cases:

  1. moving a model with m2m and fks – passed
  2. moving a model with through model and referenced by through model – passed
  3. moving a model which inherits an abstract model – passed
  4. moving a model which inherits a non-abstract model – passed
  5. moving a model with generic foreign keys – passed
  6. moving a model with indexes not defined by model – passed
  7. moving a model with triggers – passed

*moving multiple models together is not yet working as it involves creating complex dependencies with multiple migration files - looking into this .

Everything seems to be working fine till now. I would really appreciate the feedback from the community and also please let me know if there are more cases for which i should test my patch.

Thanks!

1 Like

Hi,
Recently I was thinking about @shaib comment on PR which was:

In general, the use of managed here feels wrong, even if the code is simpler in the implementation and shorter in the generated migration – because it introduces an unrelated concept into the generated migration code.

And I came up with some modifications - we can combine the db_table approach along with CreateModelInAppState and DeleteModelInAppState sub operations. This will remove the need to alter model options and also simplify overall moving operation. Only one migration per app will be created in this case - one in old app , another one in new app.

I understand @charettes’s concern to avoid introducing new operations but i strongly believe its worth adding in this case.

My main concerns with the only two migration approach (send & receive) is that there is no way to ensure that no other migrations are run in-between them.

This means that any migration added in a separate app for a model that references the moved one will have to add manual dependencies in the future to denote when it should run (before or after the move) to prevent interlacing risks.

We do some manual re-pointing of relationships in RenameModel but it’s an atomic sequence of state alterations that cannot have arbitrary migration / operation code run in between.

If we want to take this approach I think we still need to respect these axioms

  • At any in-between operation point there is at most one managed model pointing to the moved model table
  • At any in-between operation point there is at least one managed model pointing to the moved model table
  • At any in-between operation point all references to the moved model should be resolvable (no dangling references)

If we believe the generated series of operation is ugly or intimating I’m not against adding new façade operation that are composite of already existing ones as these already have well defined reduction rules but I think we should not take this decision lightly as once these operations are out there we must support them in their current form unless we go through painful deprecation.

My implementation suggestion was based on the above axioms and the fact that using existing time proven primitive operations felt less risky than introducing new operations that we must commit to support in their proposed form.

2 Likes

Thanks @charettes for the explanation.
I have been digging in the process of generating dependencies to solve the problem with generating the last AlterModelOptions operation. From what i understood :

  • First a DAG graph is created for existing migrations, then new changes are detected by comparing from_state and to_state. A Dependency is basically a tuple associated with every operation whose execution depends on other operation.

  • After comparing the states all the generated operations are sorted using a TopologicalSorter. Here the generated operation are reordered to make their execution possible.

  • After that from the whole generated list, operations are chopped from the point of an outgoing dependency and auto migrations are created here.

  • Then migration operations in auto migrations are optimized (in a way somewhat similar to squashmigrations)

  • Subsequently the graph is rearranged and migration files are written.

I must be missing something that is making addition of last AlterModelOptions faulty. I’d appreciate if you could take a look and provide some suggestions . Thanks a lot.

Hi, I have pushed some changes and it is working fine for me now. I even tried to move multiple models at once and it was working! To make it easier to understand how my patch works in different scenarios, i have created a github repo with multiple apps .
Looking forward to the feedback from the community.

Thanks!

1 Like