Bulk_create with UniqueConstraint

When trying to bulk_create with update_conflicts=True and a UniqueConstraint on the model, I get duplicate entries.

Here’s the model:

class TripSamsara(models.Model):
    driver_1 = models.ForeignKey(Driver, on_delete=models.CASCADE, related_name='driver_1_trips_samsara', null=True, blank=True)
    driver_2 = models.ForeignKey(Driver, on_delete=models.CASCADE, related_name='driver_2_trips_samsara', null=True, blank=True)
    truck = models.ForeignKey(Truck, on_delete=models.CASCADE, null=True)
    trailer = models.ForeignKey(Trailer, on_delete=models.CASCADE, null=True)
    start_location = models.CharField(max_length=255, null=True, blank=True)
    end_location = models.CharField(max_length=255, null=True, blank=True)
    start_date = models.DateTimeField(null=True, blank=True)
    end_date = models.DateTimeField(null=True, blank=True)
    distance_meters = models.IntegerField(null=True, blank=True)
    fuel_consumed_ml = models.IntegerField(null=True, blank=True)
    start_odometer = models.IntegerField(null=True, blank=True)
    end_odometer = models.IntegerField(null=True, blank=True)
    
    class Meta:
        verbose_name_plural = 'Trips Samsara'
        constraints = [
            models.UniqueConstraint(fields=['driver_1', 'driver_2', 'truck', 'start_odometer', 'end_odometer'], name='unique_trip_samsara')
        ]

Here’s the function:

samsara_created_trips = TripSamsara.objects.bulk_create(
                            trips_bulk_create,
                            update_conflicts=True,
                            unique_fields=['driver_1', 'driver_2', 'truck', 'start_odometer', 'end_odometer'],
                            update_fields=['trailer', 'start_location', 'end_location', 'distance_meters', 'fuel_consumed_ml', 'start_date', 'end_date']
                        )

The db table seems to have the unique constraint as well:

What am I missing?

Do any of the fields contain NULL values? I see a few of the fields included in your unique constraints are nullable (null=True).

If any member of a composite unique constraint is NULL then any tuple containing them will be considered distinct because NULL != NULL in SQL.

In other words when taking null values into consideration your unique constraint definition is equivalent to

UniqueConstraint(
    fields=['driver_1', 'driver_2', 'truck', 'start_odometer', 'end_odometer'],
    condition=Q(
        driver_1__isnull=False,
        driver_2__isnull=False,
        truck__isnull=False,
        start_odometer__isnull=False,
        end_odometer__isnull=False,
    )
    name='unique_trip_samsara',
)

With the exception that the lack of explicit condition, as you’ve defined it, stores the null-containing tuples in the index b-tree and thus the latter can be used for their efficient retrieval.

A way around that that I know of is using the NULLS NOT DISTINCT option of Postgres 15+ which Django supports since 5.0 or to use an expression index that COALESCE null values into sentinel to denote emptyness.

For example

UniqueConstraint(
    expression=Func(
        Coalesce("driver_1", 0),
        ...,
        Coalesce("end_odometer", 0),
        function="ROW"
    )
    name='unique_trip_samsara',
)
1 Like

Thank you, Simon!

I think you’re right – most of the entries in the database have the field ‘driver_2’ null. I’m currently on Django 4.2 and would like to avoid an upgrade, at least at this point in time.

I’ll check to see if removing the ‘driver_2’ from the unique constraint will still make entries unique. Otherwise, I’m looking at generating a UUID5 for each combination and using that as an ID.