Unique constraint on M2M Through

Hi everyone! I’ve got a bit of a pickle, hopefully you can help.

First: Django 5.0.3, Python 3.10, Postgres back end

I have two Models: Criterion and Spell

class Criterion(models.Model):
    value1 = models.IntegerField()
    value2 = models.IntegerField()
    operator = models.IntegerField()

    class Meta:
        models.UniqueConstraint(
            fields = ['value1', 'value2', 'operator'],
            name = 'unique_criterion'
        )

    def __eq__(self, other):
        return self.value1 == other.value1 and self.value2 == other.value2 and self.operator == other.operator
    
    def __hash__(self):
        return super().__hash__()

class Spell(models.Model):
    target = models.IntegerField(null=True)
    tickCount = models.IntegerField(null=True)
    tickInterval = models.IntegerField(null=True)
    spellID = models.IntegerField(null=True)
    spellFormat = models.CharField(max_length=512, null=True)
    criteria = models.ManyToManyField(Criterion)
    spellParams = models.JSONField(default=list)

As you can see, Spell has a ManyToMany relationship with Criterion - a Spell has a set of Criterion. Criterions represent elements in an expression tree (basically reverse-polish notation), and so it does happen that a single Criterion may appear more than one in a Spell’s criteria.

This all works just fine if I populate the database in the “ORM way”, by creating a Spell, populating the fields, saving it, then adding Criterion objects using add(). However, it’s also very, very slow (a day or more for this data set). I’d like to speed that up by using bulk_create().

Criterion aren’t a problem, I can just make a list of them, ensure they’re unique, and bulk_create(). Spells aren’t a problem either, minus the criteria relationship. For that, I keep a list of Criterion and a list of Spell, and (thanks to resetting the auto-incrementing counters in postgres) I can predict PKs before objects are created to build a list of Spell.criteria.through objects to later bulk_create(). This works as well.

However, when I attempt to bulk_create() that list of Through objects, I get an IntegrityError: duplicate key value violates unique constraint. The error is correct, there are duplicates - because Criterion can repeat on a single Spell. I haven’t added any uniqueness constraint on the relationship, and there’s none in my migration files, but I assume ManyToManyField does that under the hood.

Using add() doesn’t have a problem with duplicate Criterion, and those duplicates show up again normally when I query Spell.criteria later. So I guess I need help understanding what else add() does to manage this relationship and honor the uniqueness constraint, and whether there’s a way I can account for that “manually” on the Through table so I can bulk_create() those objects?

Would this help?

It helps confirm I’m not totally crazy, thank you!

I’m basically doing what the solution describes there, doing bulk_create on the through model. But repeated entries on that list throw an IntegrityError.

Are you passing the ignore_conflicts option to bulk_create?

That is helpful indeed, and I saw in the generated SQL that “ON CONFLICT DO NOTHING” is being used as well. That helps me get past the IntegrityError, though now the repeated Criterion are being dropped.

I’ll likely have to do some deep diving into what’s actually happening under the hood and possibly rethink my entire approach, but I appreciate the helpful tip!

If you just need to know how many times a criterion is repeated on a spell, you could create a custom through class, and store it as an integer, thus avoiding to violate the unique constraint.
This will also make the through table smaller, and have better indexing.

The documentation for setting a custom through class is here.