Add ability to define custom column names in the table created by the ManyToManyField

If your company has strict database design guidelines, it is an extra burden to adhere to them where the ManyToManyField() is concerned because you must define a through model in order to have custom column names in the through table.

You can define a custom table name with the db_table argument. However, there is no way to define the column names in that table unless you go to the trouble of defining a custom through model. That adds extra code and somewhat defeats the purpose of the ManyToManyField(), especially when you don’t need additional fields in your through table.

Defining a through model also breaks the convenient .set(), .add(), and .create() methods on the ManyToManyField() instance, even though nothing should break when you haven’t added extra fields to the through table.

In the example below, Django’s default name for the foreign key column in the through table would be shoppingcart_id. That doesn’t work if your company’s style guide requires the name to be shopping_cart_id.

One solution would be to support custom column names with arguments like from_column_name and to_column_name, as in the following example.

class Item(models.Model):
    name = models.CharField(max_length=255)

    class Meta:
        db_table = "item"


class ShoppingCart(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    items = models.ManyToManyField(Item, from_column_name="shopping_cart", to_column_name="item")

    class Meta:
        db_table = "shopping_cart"

Another idea would be to use the existing ManyToManyField.through_fields argument for defining custom column names. For example:

items = models.ManyToManyField(Item, through_fields=("shopping_cart", "item"))

Another idea would be to have a project setting that makes the through table column names default to words separated by underscores. However, this wouldn’t allow for as much flexibility and wouldn’t work for an existing project.
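To make the difference between the two naming schemes concrete, here is a minimal sketch in plain Python rather than Django code. Both function names are hypothetical and neither exists in Django: default_column_name imitates the lowercased model name that produces shoppingcart_id, and snake_case_column_name shows what an underscore-splitting default might produce.

```python
import re


def default_column_name(model_class_name: str) -> str:
    # Imitates the behaviour described above: the class name is lowercased
    # as-is, so the boundary between words is lost.
    return f"{model_class_name.lower()}_id"


def snake_case_column_name(model_class_name: str) -> str:
    # Hypothetical alternative: insert an underscore wherever a capital
    # letter follows a lowercase letter or digit, then lowercase the result.
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", model_class_name)
    return f"{snake.lower()}_id"


print(default_column_name("ShoppingCart"))
print(snake_case_column_name("ShoppingCart"))
```

A setting like this would change the derived names for every model at once, which is why it would not suit an existing project whose tables already use the current names.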

Instead, what you have to do now is create a through model like the following:

class Item(models.Model):
    name = models.CharField(max_length=255)

    class Meta:
        db_table = "item"


class ShoppingCart(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    items = models.ManyToManyField(Item, through="ShoppingCartItem")

    class Meta:
        db_table = "shopping_cart"


class ShoppingCartItem(models.Model):
    shopping_cart = models.ForeignKey(ShoppingCart, on_delete=models.CASCADE)
    item = models.ForeignKey(Item, on_delete=models.CASCADE)

    class Meta:
        db_table = "shopping_cart_item"

I would be happy to take a stab at creating a PR if others familiar with the ORM would point me in the right direction and if we could get a design decision on what method to go with.


It used to be true that a through model breaks add(), create(), and set(). I’m not sure it still is.

From the docs (Models | Django documentation | Django):

You can also use add(), create(), or set() to create relationships, as long as you specify through_defaults for any required fields:

I suspect if you have no other non-defaulted fields that you don’t have to specify the through_defaults.

-1 from me. I don’t think we should customize ManyToManyField to support one niche requirement when there’s already a way to do so (regardless of whether add() etc. work, they’re nice-to-have shortcuts). If we did this for other requirements, we’d end up replicating the model interface in ManyToManyField arguments. The db_table argument may be too much already for my taste, but we can’t remove that.

From a philosophical standpoint, I understand where you’re coming from, but from a practical perspective on large projects, this is a terrible DX. It took an extra week to refactor our models to conform to our database style guide when it would have taken less than a day if this feature were in place. We also had to update API abstractions to handle the case when add(), create(), and set() weren’t available.

If you followed your philosophy fully, you would advocate for removing the ManyToManyField and requiring everyone to define those relationships explicitly all the time, which, in my opinion, would not improve the developer experience.

You could also do this customisation with a ManyToManyField subclass that overrides contribute_to_class to call a wrapped/adjusted version of the function that creates the through model: https://github.com/django/django/blob/19c4052f98e5dc4fe9d7edd7125df6a66efbd79f/django/db/models/fields/related.py#L1322 .

@adamchainz, thank you for your suggestion. I was able to get a POC working. It required subclassing the ManyToManyField() as you suggested; however, I had to copy the entire contribute_to_class() method only to change a couple of lines, and then I also had to create a custom create_many_to_many_intermediary_model() function, only to change a few lines from the original class and function. Doing this adds a lot of Django source code to manage in a project, which would make Django upgrades more difficult.

Adding these changes to Django makes more sense since it’s only a few lines of code and aligns with the already established API. Since you can already define a custom column name on a ForeignKey() with db_column, it makes logical sense that Django would support this on a ManyToManyField().

For context, this is my POC, where I’ve added comments that start CHANGED FROM ORIGINAL to identify what was changed.

I would be happy to create a PR complete with tests and documentation!

App models.py file

from django.conf import settings
from django.db import models

from apps.base.model_fields import CustomColumnNameManyToManyField


class Item(models.Model):
    name = models.CharField(max_length=255)

    class Meta:
        db_table = "item"


class ShoppingCart(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    items = CustomColumnNameManyToManyField(
        Item,
        db_table="shopping_cart_item",
        db_from_column_name="shopping_cart",
        db_to_column_name="item",
    )

    class Meta:
        db_table = "shopping_cart"

Custom model_fields.py

from functools import partial

from django.core.exceptions import ImproperlyConfigured
from django.db.models.deletion import CASCADE
from django.db.models.utils import make_model_tuple
from django.utils.translation import gettext_lazy as _

from django.db.models.fields.related import (
    resolve_relation,
    lazy_related_operation,
    RECURSIVE_RELATIONSHIP_CONSTANT,
    RelatedField,
)
from django.db.models.fields.related_descriptors import (
    ManyToManyDescriptor,
)

from django.db.models import ManyToManyField


def create_custom_column_name_many_to_many_intermediary_model(field, klass):
    from django.db import models

    def set_managed(model, related, through):
        through._meta.managed = model._meta.managed or related._meta.managed

    to_model = resolve_relation(klass, field.remote_field.model)
    name = "%s_%s" % (klass._meta.object_name, field.name)
    lazy_related_operation(set_managed, klass, to_model, name)

    # CHANGED FROM ORIGINAL: The following two lines are changed from the original django create_many_to_many_intermediary_model function
    to = getattr(field, "_to_column_name") or make_model_tuple(to_model)[1]
    from_ = getattr(field, "_from_column_name") or klass._meta.model_name
    if to == from_:
        to = "to_%s" % to
        from_ = "from_%s" % from_

    meta = type(
        "Meta",
        (),
        {
            "db_table": field._get_m2m_db_table(klass._meta),
            "auto_created": klass,
            "app_label": klass._meta.app_label,
            "db_tablespace": klass._meta.db_tablespace,
            "unique_together": (from_, to),
            "verbose_name": _("%(from)s-%(to)s relationship")
            % {"from": from_, "to": to},
            "verbose_name_plural": _("%(from)s-%(to)s relationships")
            % {"from": from_, "to": to},
            "apps": field.model._meta.apps,
        },
    )
    # Construct and return the new class.
    return type(
        name,
        (models.Model,),
        {
            "Meta": meta,
            "__module__": klass.__module__,
            from_: models.ForeignKey(
                klass,
                related_name="%s+" % name,
                db_tablespace=field.db_tablespace,
                db_constraint=field.remote_field.db_constraint,
                on_delete=CASCADE,
            ),
            to: models.ForeignKey(
                to_model,
                related_name="%s+" % name,
                db_tablespace=field.db_tablespace,
                db_constraint=field.remote_field.db_constraint,
                on_delete=CASCADE,
            ),
        },
    )


class CustomColumnNameManyToManyField(ManyToManyField):
    # CHANGED FROM ORIGINAL: The following __init__ override was added to the original django ManyToManyField class
    def __init__(
        self, *args, db_from_column_name=None, db_to_column_name=None, **kwargs
    ):
        # At least one override is required; a missing one falls back to
        # Django's default name when the intermediary model is created.
        if db_from_column_name is None and db_to_column_name is None:
            raise ImproperlyConfigured(
                "CustomColumnNameManyToManyField requires that you specify either db_from_column_name, db_to_column_name, or both."
            )
        self._from_column_name = db_from_column_name
        self._to_column_name = db_to_column_name
        super().__init__(*args, **kwargs)

    def contribute_to_class(self, cls, name, **kwargs):
        # To support multiple relations to self, it's useful to have a non-None
        # related name on symmetrical relations for internal reasons. The
        # concept doesn't make a lot of sense externally ("you want me to
        # specify *what* on my non-reversible relation?!"), so we set it up
        # automatically. The funky name reduces the chance of an accidental
        # clash.
        if self.remote_field.symmetrical and (
            self.remote_field.model == RECURSIVE_RELATIONSHIP_CONSTANT
            or self.remote_field.model == cls._meta.object_name
        ):
            self.remote_field.related_name = "%s_rel_+" % name
        elif self.remote_field.is_hidden():
            # If the backwards relation is disabled, replace the original
            # related_name with one generated from the m2m field name. Django
            # still uses backwards relations internally and we need to avoid
            # clashes between multiple m2m fields with related_name == '+'.
            self.remote_field.related_name = "_%s_%s_%s_+" % (
                cls._meta.app_label,
                cls.__name__.lower(),
                name,
            )

        # CHANGED FROM ORIGINAL: The following line was changed from the original django ManyToManyField class
        RelatedField.contribute_to_class(self, cls, name, **kwargs)

        # The intermediate m2m model is not auto created if:
        #  1) There is a manually specified intermediate, or
        #  2) The class owning the m2m field is abstract.
        #  3) The class owning the m2m field has been swapped out.
        if not cls._meta.abstract:
            if self.remote_field.through:

                def resolve_through_model(_, model, field):
                    field.remote_field.through = model

                lazy_related_operation(
                    resolve_through_model, cls, self.remote_field.through, field=self
                )
            elif not cls._meta.swapped:
                self.remote_field.through = (
                    # CHANGED FROM ORIGINAL: The following line was changed from the original django ManyToManyField class
                    create_custom_column_name_many_to_many_intermediary_model(self, cls)
                )

        # Add the descriptor for the m2m relation.
        setattr(cls, self.name, ManyToManyDescriptor(self.remote_field, reverse=False))

        # Set up the accessor for the m2m table name for the relation.
        self.m2m_db_table = partial(self._get_m2m_db_table, cls._meta)

    def deconstruct(self):
        # CHANGED FROM ORIGINAL: This override was added to the original django ManyToManyField class
        name, path, args, kwargs = super().deconstruct()
        if self._from_column_name is not None:
            kwargs["db_from_column_name"] = self._from_column_name
        if self._to_column_name is not None:
            kwargs["db_to_column_name"] = self._to_column_name
        return name, path, args, kwargs
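For readers skimming the POC, the only behavioural change in the intermediary-model function is how the two field names are chosen. A minimal sketch of that selection logic, isolated from Django so it can be run on its own, might look like the following. The function name pick_through_field_names and its parameters are hypothetical and are not part of Django or of the POC.

```python
from typing import Optional, Tuple


def pick_through_field_names(
    from_override: Optional[str],
    to_override: Optional[str],
    from_default: str,
    to_default: str,
) -> Tuple[str, str]:
    # Use the custom name when one was given, otherwise the default name.
    from_ = from_override or from_default
    to = to_override or to_default
    # A self-referential relation would otherwise produce two identical
    # field names, so both are prefixed, as in the function above.
    if to == from_:
        to = f"to_{to}"
        from_ = f"from_{from_}"
    return from_, to


# Custom names win over the lowercased model names.
print(pick_through_field_names("shopping_cart", "item", "shoppingcart", "item"))
# With no overrides the defaults are used unchanged.
print(pick_through_field_names(None, None, "shoppingcart", "item"))
```

Because the foreign keys on the generated through model are named from these strings, the resulting columns get the usual _id suffix, for example shopping_cart_id.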