I’m puzzled by Django behavior when using NOT IN query to filter queryset by value of a field from related nullable relationship.
If we have the following models:
class Vendor(models.Model):
name = models.CharField(max_length=128)
class Asset(models.Model):
vendor = models.ForeignKey(to=Vendor, blank=True, null=True, on_delete=models.PROTECT, related_name=‘vendor’)
And attempt to find all Assets that do not match vendor name ‘Unknown’, Django will generate something along these lines:
In [1]: l = list(Asset.objects.exclude(vendor__name__in=['unknown']))
SELECT "app_asset"."id"
FROM "app_asset"
LEFT OUTER JOIN "app_vendor"
ON ("app_asset"."vendor_id" = "app_vendor"."id")
WHERE NOT ("app_vendor"."name" IN ('unknown') AND "app_vendor"."title" IS NOT NULL)
If we have Assets with vendor = None, such query will filter out all such rows, which is obviously not what the user would want (excluding only Assets with the specific vendor name).
I’ve spent hours googling and reading Django sources, but I still do not understand why extra AND field IS NOT NULL clause is added and how can I suppress it.
I’ve added workaround to my app by using additional OR field IS NULL clause, but want to get to the bottom of the issue - maybe Django adds this clause for a reason. Does anybody know?