when "exclude" is not negation of "filter", is it a bug ?

It seems natural to expect that the queries
Book.objects.filter(criteria) and Bool.objects.exlude(same criteria) give complementary results: each book should appear in one and only one of these two querysets. However, I am faced with a situation in which this is not the case. My question is whether this should be considered normal or whether it is a bug to report. Here’s a minimal example:

class Genre(models.Model):
name = models.CharField(max_length=100)

class Book(models.Model):
title = models.CharField(max_length=100)
genre = models.ForeignKey(Genre, on_delete=models.CASCADE)
genres = models.ManyToManyField(Genre, related_name='books')

Imagine that genres is the family of genres that properly represents the book and genre is the main genre. We want to make sure that book.genre is in book.genres.all(). We get the list of books that satisfy this condition by

Book.objects.filter(genre__in = F('genres'))

There, no problem. We hope that the following query will provide the list of books that do not satisfy it:

Book.objects.exclude(genre__in = F('genres'))

But bang: this is not at all what is expected. Let’s introduce some items in the database :

And let’s see the result:

>>> Book.objects.filter(genre__in = F('genres'))
<QuerySet [<Book: Book1>, <Book: Book2>, <Book: Book4>]>
>>> Book.objects.exclude(genre__in = F('genres')) 
<QuerySet [<Book: Book1>, <Book: Book3>, <Book: Book3>, <Book: Book4>, <Book: Book4>]>

We understand what is happening by examining the SQL query produced:

SELECT "myapp_book"."id", "myapp_book"."genre_id"
FROM "myapp_book"
LEFT OUTER JOIN
"myapp_book_genres"
ON ("myapp_book"."id" = "myapp_book_genres"."book_id")
WHERE NOT ("myapp_book"."genre_id" IN ("myapp_book_genres"."genre_id"))

which is not, it seems to me, what we expect (instead of “myapp_book_genres”.“genre_id”, at the end, we should have some subquery catching “book.objects.genres.all()” ).

So : is it a bug or not ?

Yes, I would consider this expected. The docs for exclude state:

The lookup parameters (**kwargs ) should be in the format described in Field lookups below. Multiple parameters are joined via AND in the underlying SQL statement, and the whole thing is enclosed in a NOT() .

So from my layman’s perspective (I’m far from knowledgeable about the inner workings of the ORM), I think about this as a “textual transformation” of the SQL and not a “logical inversion”.

Which is precisely what you are seeing here.

Note: This type of asymmetry can happen in other situations as well

So that is, in the general case, what you need to create in the ORM to get the desired answer. For example, you should be able to do something like this:

Book.objects.annotate(
  in_genres=Exists(Book.objects.filter(
    id=OuterRef('id'), genre__in=F('genres')
).filter(in_genres=False)

(Kinda winging this - this may not be precisely correct.)

Hi Ken,

Thanks for taking the time to answer this question and for looking for an alternative solution.
However, I’m not completely convinced. The query

Book.objects.filter(genre__in = F('genres'))

does indeed give the same SQL code, without the NOT. It turns out that this query gives the same result as the query that I would expect, but that’s by luck in a way (or bad luck because it masks the problem – at least what seems to me to be a problem).

Please clarify, what are you not convinced about?

The exclude function does exactly what it’s documented to do.

Additionally, this asymmetry between filter and exclude is also referenced in the docs for Spanning multi-valued relationships. (It’s not directly on point for the situation you describe, but it does highlight that in this type of situation the two functions do not return exclusive results.)

Oh, yes, you are right about that asymmetry example. It definitely shows that the symmetry of filter and exclude is not a rule.
I will look again at the case I am interested in in light of this example (it is a bit tricky though!) and will come back if it still seems abnormal to me.
Thanks again.