[GSoC 2021] Adapt Schema Editors to operate from model states instead of fake rendered models

Hello everyone!
This discussion is for my GSoC project.

Communication

I have set up a documentation blog here. I will document all my GSoC-related work in this blog, along with my plans and their progress.
For communication with my mentor and the Django Fellows, I would prefer to use either the Django Forum or the django-developers mailing list.

Schedule

I would like to plan things out before starting on anything and then stick to that plan. I would also like to continue with the plan described in my project proposal.

Major Milestones:
  1. Creating and Populating Central Registry
  2. Adapting Central Registry and ModelState
  3. Working on tests and documentation
Tasks for the upcoming 1.5 weeks (May 22 - June 1):
  • Creating proxies for all Operation subclasses’ state_forwards and state_backwards methods in ProjectState.
  • Initializing the central registry in ProjectState (like ProjectState.add_field() etc.).
  • Coding the logic to populate the registry with ProjectState.add_model() and the newly introduced ProjectState.add_field().
  • Fixing and writing tests along with documentation (if required).
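
To make the first task concrete, here is a minimal sketch of what "proxying" could look like. The classes below are toy stand-ins, not the real Django classes, and the add_field() signature is an assumption for illustration only: the operation stops mutating the state itself and delegates to a ProjectState method.

```python
class ProjectState:
    """Toy stand-in for django.db.migrations.state.ProjectState."""

    def __init__(self):
        # Maps (app_label, model_name) -> {field_name: field}
        self.models = {}

    def add_model(self, app_label, model_name, fields):
        self.models[app_label, model_name] = dict(fields)

    def add_field(self, app_label, model_name, name, field):
        # Hypothetical signature: one place that knows how to mutate the state.
        self.models[app_label, model_name][name] = field


class AddField:
    """Toy stand-in for a migration operation."""

    def __init__(self, model_name, name, field):
        self.model_name = model_name
        self.name = name
        self.field = field

    def state_forwards(self, app_label, state):
        # The operation delegates to ProjectState instead of
        # editing state.models directly.
        state.add_field(app_label, self.model_name.lower(), self.name, self.field)


state = ProjectState()
state.add_model("blog", "post", {"id": "AutoField"})
AddField("Post", "title", "CharField").state_forwards("blog", state)
print(state.models["blog", "post"])
```

With every mutation going through ProjectState, the registry can later be updated in the same methods without touching each operation.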

I am starting with proxying all Operation subclasses’ state_forwards and state_backwards methods in ProjectState (if my mentor agrees, of course).

Any suggestions, views, thoughts or feedback are most welcome. I would also like to know the community’s expectations of me.

Working on optimizing the migrations framework this summer is going to be really interesting. It feels like a once-in-a-lifetime experience.


I have created a PR that adds proxies for all state_forwards methods to the ProjectState class.
PR


While writing the code for the central registry in ProjectState, I came across the problem described below. I am trying to figure out a solution and would be glad if anyone could help me out.

Problem

To initialize the central registry in django.db.migrations.state.ProjectState.add_model, I need the following parameters:

  • from_app_label
  • from_model_name
  • from_field
  • to_model_name
  • to_app_label
  • to_field

Of these required values, I figured out how to get the first five as follows:

  • from_app_label - model_state.app_label
  • from_model_name - model_state.name_lower
  • from_field - each name in model_state.fields.items()
  • to_model_name - field.remote_field.model for each field in model_state.fields.items()
  • to_app_label - field.remote_field.model.split(".")[0]

But I am unable to figure out how I can get to_field when it is not explicitly defined by the developer.

Suggestions and feedback would be highly appreciated.

Regards,
Manav

I have solved the above-mentioned problem, but I am stuck on how to store many-to-many relations in the registry. For foreign key relations we could use the following structure:

project_state.related_fields_registry[app_label, model_name]: [(from_field, app_label, model_name, to_field), ...]
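
As a rough sketch of that structure (a plain dict outside of ProjectState, with hypothetical helper names register_relation and models_pointing_to), the foreign key case could look like this:

```python
from collections import defaultdict

# Key: (from_app_label, from_model_name)
# Value: list of (from_field, to_app_label, to_model_name, to_field)
related_fields_registry = defaultdict(list)


def register_relation(from_app_label, from_model, from_field,
                      to_app_label, to_model, to_field):
    """Record one foreign key in the registry (hypothetical helper)."""
    related_fields_registry[from_app_label, from_model].append(
        (from_field, to_app_label, to_model, to_field)
    )


def models_pointing_to(app_label, model_name):
    """Return (from_app_label, from_model, from_field) for each relation
    whose target is the given model."""
    return [
        (from_app, from_model, from_field)
        for (from_app, from_model), relations in related_fields_registry.items()
        for from_field, to_app, to_model, _to_field in relations
        if (to_app, to_model) == (app_label, model_name)
    ]


register_relation("blog", "post", "author", "auth", "user", "id")
register_relation("blog", "comment", "post", "blog", "post", "id")

print(models_pointing_to("blog", "post"))  # [('blog', 'comment', 'post')]
```

Keying by the "from" model makes forward lookups cheap, while reverse lookups need a scan as above; whether that trade-off is acceptable is an open question.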

But a many-to-many field has a through model, so how can we manage that?
I would like to ping @felixxm @MarkusH @charettes for the solution.

Suggestions and feedback would be highly appreciated.

Would it not be possible for you to create the through models (replicating what the actual models do?) with more or less the same code? It would probably be easy to factor out at least the through model/field names to share that between this and the real code. I imagine the big thing you want is those names, right?

So a ManyToMany declaration ends up also creating the model in your add_model/add_field code. If I’m not mistaken that’s basically what happens in real Django code.


Looks like you’ve resolved this issue, but when not specified, to_field must default to the primary key of the referenced model.
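
A rough sketch of that default, using a plain dict in place of the real model states (resolve_to_field and the "primary_key" option layout are made up for illustration, and this assumes the referenced model is already known when the lookup happens):

```python
def resolve_to_field(models, to_app_label, to_model_name, to_field=None):
    """Return the explicit to_field, or fall back to the referenced
    model's primary key field name (hypothetical helper)."""
    if to_field is not None:
        return to_field
    fields = models[to_app_label, to_model_name]
    for name, options in fields.items():
        if options.get("primary_key"):
            return name
    raise LookupError(
        f"No primary key found on {to_app_label}.{to_model_name}"
    )


# Toy stand-in for the models known to the project state.
models = {
    ("auth", "user"): {
        "id": {"primary_key": True},
        "username": {"unique": True},
    },
}

print(resolve_to_field(models, "auth", "user"))              # id
print(resolve_to_field(models, "auth", "user", "username"))  # username
```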

@rtpg’s suggestion could work and might even make adapting the schema editor a bit easier, as the latter wouldn’t have to care about whether or not such models were auto-created by the framework.

As for the data structure itself, it will likely need to be adapted to store both through and through_fields. It might not be necessary in the first place if you populate the registry with through models as @rtpg suggested, though, as you might be able to rely solely on the implicitly created foreign keys to deduce relationships.
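
To illustrate the idea, here is a simplified sketch where a many-to-many declaration registers an auto-created through model with two implicit foreign keys. The function name, the ("fk" | "m2m", target) field description, and the through model naming are assumptions for this example, not Django's actual code:

```python
def add_model_with_through(registry, app_label, model_name, fields):
    """Populate the registry for one model.

    fields maps field name -> (kind, "app_label.model_name"),
    where kind is "fk" or "m2m" (a made-up format for this sketch).
    """
    for name, (kind, target) in fields.items():
        to_app, to_model = target.split(".")
        if kind == "fk":
            registry.setdefault((app_label, model_name), []).append(
                (name, to_app, to_model)
            )
        elif kind == "m2m":
            # Register the through model as if it were a normal model
            # with one foreign key to each side of the relation.
            through = f"{model_name}_{name}"
            registry[app_label, through] = [
                (model_name, app_label, model_name),
                (to_model, to_app, to_model),
            ]


registry = {}
add_model_with_through(
    registry,
    "blog",
    "post",
    {"author": ("fk", "auth.user"), "tags": ("m2m", "blog.tag")},
)
for key, relations in registry.items():
    print(key, relations)
```

In this layout the registry never stores a many-to-many entry directly, so consumers only ever see foreign keys.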


I think it will take some time to digest the suggestions from @rtpg and @charettes. I also need to dig further into the code to have something concrete in mind.

Thank you @rtpg and @charettes for your suggestions. I have some code in mind for add_model in django.db.migrations.state.ProjectState, which is as follows.

        for name, field in model_state.fields.items():
            if field.is_relation:
                from_app_label = model_state.app_label
                from_model = model_state.name_lower
                from_field = name
                # Assumes remote_field.model is an "app_label.ModelName" string
                to_app_label, to_model_name = field.remote_field.model.split(".")
                # to_field is the through model name for m2m fields
                # and to_fields for foreign keys
                if field.many_to_many:
                    to_field = field.remote_field.through or "_".join(
                        (model_state.name_lower, name.lower())
                    )
                else:
                    to_field = field.to_fields
                # Create the list for this model on first use, then append
                self.related_fields.setdefault(
                    (from_app_label, from_model), []
                ).append((from_field, to_app_label, to_model_name, to_field))

This code populates the central registry from add_model. If it seems fine, I will continue with populating the registry from the other methods as well.
Any improvements or suggestions for a better implementation would be appreciated.