Improving Q objects with True, False, and None

Q() is often used as a starting value for building complex lookups with Q objects. It is an “empty” operation that is dropped from any expression, leading to some confusion. To make usage more explicit, I’d like to add the following special values:

  1. An “always satisfied” condition: Q(True)
  2. An “always unsatisfied” condition: Q(False)
  3. An “empty” operation that defaults to no results: Q(None)

Q(False)

Q(False) is useful as a more readable version of Q(pk__in=[]). A common mistake is the following:

from django.db.models import Q

def find_books(pseudonyms):
    # Start from an empty Q() and OR in one condition per pseudonym.
    books_filter = Q()
    for name in pseudonyms:
        books_filter |= Q(author=name)
    return Book.objects.filter(books_filter)

The result is similar to Book.objects.filter(author__in=pseudonyms), except that ALL books are returned when pseudonyms is empty: the loop never runs, and the empty Q() that is left applies no filter at all.

To address this, you should use Q(pk__in=[]) as a starting value. The query optimizer recognizes that this condition is always unsatisfied and handles it nicely, even for complex queries. Many people use Q(pk=None) or Q(pk__isnull=True), which aren’t handled as nicely (perhaps because pk can be overridden).

Q(True) and Q(None)

Uses for Q(True) and Q(None) are less obvious. Consider the following code:

def search_books(conditions, favourite_author=None):
    q = STARTING_VALUE # either Q(), Q(True), or Q(None)
    for condition in conditions:
        q &= condition
    if favourite_author is None:
        return Book.objects.filter(q)
    return Book.objects.filter(Q(author=favourite_author) | q)

When conditions is populated, the behaviour is always the same. The function searches for books that either match all the conditions OR are written by the (optional) favourite author. The starting value matters when conditions is empty:

STARTING_VALUE  favourite_author  result
Q()             None              All books
Q()             "Tolkien"         Books by Tolkien
Q(True)         None              All books
Q(True)         "Tolkien"         All books
Q(None)         None              Nothing
Q(None)         "Tolkien"         Books by Tolkien

Operations Tables

The special values and their operations can be defined at the Q object level, meaning expressions can be optimized during evaluation. This has a slight advantage over Q(id__in=[]), which is only optimized away when building the query.

Currently ~Q() and Q(Q()) are internally represented differently than Q(), but I don’t think it would ever make a difference.

AND “&”

          &Q()       &Q(True)   &Q(False)  &Q(None)   &Q(**k)
Q()       =Q()       =Q(True)   =Q(False)  =Q(None)   =Q(**k)
Q(True)   =Q(True)   =Q(True)   =Q(False)  =Q(True)   =Q(**k)
Q(False)  =Q(False)  =Q(False)  =Q(False)  =Q(False)  =Q(False)
Q(None)   =Q(None)   =Q(True)   =Q(False)  =Q(None)   =Q(**k)
Q(**k)    =Q(**k)    =Q(**k)    =Q(False)  =Q(**k)    NA

OR “|”

          |Q()       |Q(True)   |Q(False)  |Q(None)   |Q(**k)
Q()       =Q()       =Q(True)   =Q(False)  =Q(None)   =Q(**k)
Q(True)   =Q(True)   =Q(True)   =Q(True)   =Q(True)   =Q(True)
Q(False)  =Q(False)  =Q(True)   =Q(False)  =Q(False)  =Q(**k)
Q(None)   =Q(None)   =Q(True)   =Q(False)  =Q(None)   =Q(**k)
Q(**k)    =Q(**k)    =Q(True)   =Q(**k)    =Q(**k)    NA

Negation “~”

    Q()    Q(True)    Q(False)   Q(None)
~   =Q()   =Q(False)  =Q(True)   =Q(None)
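
The three tables can also be read as a small algebra. Here is a minimal sketch of that reading in plain Python. It says nothing about how Django would implement it: Special, q_and, q_or, and q_not are hypothetical names, and any value that is not a Special member stands in for an ordinary Q(**k).

```python
from enum import Enum


class Special(Enum):
    """Hypothetical stand-ins for the four special values in the tables."""

    EMPTY = "Q()"
    NONE = "Q(None)"
    TRUE = "Q(True)"
    FALSE = "Q(False)"


def _combine(left, right, absorbing, ranking, op_name):
    # The absorbing value wins outright: FALSE for AND, TRUE for OR.
    if absorbing in (left, right):
        return absorbing
    left_special = isinstance(left, Special)
    right_special = isinstance(right, Special)
    if left_special and right_special:
        # Between two special values, the one later in `ranking` wins.
        return max(left, right, key=ranking.index)
    if left_special:
        return right  # an ordinary condition replaces a remaining special value
    if right_special:
        return left
    # Two ordinary conditions: the "NA" cell, i.e. a normal combination.
    return (op_name, left, right)


def q_and(left, right):
    ranking = [Special.EMPTY, Special.NONE, Special.TRUE]
    return _combine(left, right, Special.FALSE, ranking, "AND")


def q_or(left, right):
    ranking = [Special.EMPTY, Special.NONE, Special.FALSE]
    return _combine(left, right, Special.TRUE, ranking, "OR")


def q_not(value):
    flipped = {Special.TRUE: Special.FALSE, Special.FALSE: Special.TRUE}
    if isinstance(value, Special):
        return flipped.get(value, value)  # EMPTY and NONE negate to themselves
    return ("NOT", value)
```

Under this reading, each table reduces to one absorbing value plus a ranking of the remaining special values: Q() < Q(None) < Q(True) for AND, and Q() < Q(None) < Q(False) for OR.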

Concerns with Q(None)

I am a bit worried that Q(None) will lead to more mistakes than it prevents.

Q(False) is a really nice way to ensure at least one OR condition exists:

def or_group(conditions):
    q = Q(False)
    for condition in conditions:
        q |= condition
    return q
Book.objects.filter(or_group([])) # empty
Book.objects.filter(or_group([]) & Q(author="Tolkien")) # empty

It’s important to note that Q(None) does not always do the same for AND conditions:

# incorrect
def and_group1(conditions):
    q = Q(None)
    for condition in conditions:
        q &= condition
    return q
Book.objects.filter(and_group1([])) # empty
Book.objects.filter(and_group1([]) & Q(author="Tolkien")) # not empty

# correct
def and_group2(conditions):
    q = Q()
    for condition in conditions:
        q &= condition
    return q | Q(False)
Book.objects.filter(and_group2([])) # empty
Book.objects.filter(and_group2([]) & Q(author="Tolkien")) # empty

The main benefit of Q(None) is that you can use it wherever you’d use Q(), unless you explicitly want the Book.objects.filter(Q()) behaviour.

I definitely see the use for Q(True) and Q(False) here, and it is convenient that both could be implemented as a basic alias at first, rather than needing the internals of Q to be reworked to implement proper optimization.

I don’t think I like Q(None) - as you say, it sort of lets you shoot yourself in the foot very easily.

On top of all of this, though, there is the argument that you should be putting the Q objects into a list and then using reduce(operator.and_, my_list) - it removes the need for a “special” initial value directly.

I think we should probably consider Q(True) and Q(False) for Django if there’s evidence people are reinventing these themselves everywhere, but we should also consider an alternative API like Q.any(Q, Q, Q) and Q.all(Q, Q, Q) to match how core Python handled this specific issue.

It’s been asked a few times on Stack Overflow. A lot of people seem to be using Q(pk__isnull=True). I commented on a few and even got an accepted answer changed.
(I’d link directly but I can only post 2 links)

  1. /35893867/always-false-q-object
  2. /29900386/how-to-construct-django-q-object-matching-none
  3. /33517468/always-true-q-object
  4. /31160994/the-right-way-to-make-q-object-which-filter-all-entries-in-django-queryset
  5. /20222457/django-building-a-queryset-with-q-objects

And it looks like most people are chaining ORs in a for loop with Q() as a starting value:

  1. https://stackoverflow.com/questions/13076822/django-dynamically-filtering-with-q-objects
  2. https://stackoverflow.com/questions/852414/how-to-dynamically-compose-an-or-query-filter-in-django

I’m not sure I buy the reduce argument. You still need an initial value unless you want TypeError: reduce() of empty sequence with no initial value.
I could see Q.any(Q, Q, Q, default=) and Q.all(Q, Q, Q, default=). Defaults for empty inputs might be easier for users to grok than reduce’s initial values.
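
To make the reduce point concrete, here is a small sketch using only the standard library. Sets of matching titles stand in for Q objects (an assumption made so the example runs without Django), and combine_or is a hypothetical helper. With real Q objects the call would look the same, with something like Q(pk__in=[]) as the initial value.

```python
import operator
from functools import reduce


def combine_or(conditions, initial):
    """OR together `conditions`, starting from `initial` (hypothetical helper)."""
    return reduce(operator.or_, conditions, initial)


# Each set stands in for "the rows one condition matches"; set() matches nothing.
combined = combine_or([{"The Hobbit"}, {"Dune"}], set())
empty_result = combine_or([], set())  # falls back to the initial value

# Without an initial value, an empty input raises instead of picking a default.
try:
    reduce(operator.or_, [])
except TypeError as exc:
    error_type = type(exc).__name__
```

So reduce does not remove the choice of starting value. It only moves it into the third argument.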

Yeah, reduce is also not something I recommend to most developers as it’s non-obvious and requires a little bit of functional knowledge.

That said, I do think a Q.all() or Q.any() would be more bullet-proof. But we should probably also have at least an empty-Q (False in your example).

Q.any() and Q.all() feel very intuitively right to me, like something I’d enjoy (and more importantly remember) to use. I’ve definitely been at the awkward “construct a Q object in a loop” point a couple of times, and just passing an iterable would have been way less of a mental strain than the current way of doing it.

Any preference for exact parameter behaviour on Q.any() and Q.all()?

The main input could be:

  1. An iterable. Similar to reduce.
    Sample a: Q.any([Q(foo=True), Q(bar=False)])
    Sample b: Q.any(q_list)
  2. Many args. Similar to ManyToManyField.add().
    Sample a: Q.any(Q(foo=True), Q(bar=False))
    Sample b: Q.any(*q_list)

Then there’s the default:

  1. Q(). Logically follows from the empty syntax Q.any() or Q.all().
  2. TypeError on empty input when no default is given. Similar to reduce.
  3. Always require a user-provided default.

I’d avoid 1 because of the mistake with pseudonyms I documented above. I don’t really like edge-case exceptions, so I’d prefer 3 over 2, but that might be frustrating for people.

I would match the built-in Python any and all as closely as possible - take in a single iterable.

As for your second question - when you say default, do you mean “what happens if you don’t provide any arguments?” I would say:

  • No argument provided: TypeError
  • Single, empty list provided: Both select no results at all, so filtering on the result returns an empty queryset
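
For comparison, this is how the Python built-ins behave in the two cases listed above. Note that all([]) is True in Python, so a Q.all([]) that selects nothing would be a deliberate departure from the built-in rather than a match. error_name_when_called_bare is just a helper name for this sketch.

```python
def error_name_when_called_bare(func):
    """Call `func` with no arguments and report the exception type, if any."""
    try:
        func()
    except TypeError as exc:
        return type(exc).__name__
    return "no error"


# No argument provided: both built-ins reject the call.
any_without_args = error_name_when_called_bare(any)
all_without_args = error_name_when_called_bare(all)

# Single, empty list provided: the two built-ins disagree.
any_of_empty = any([])  # False: nothing in the list is truthy
all_of_empty = all([])  # True: nothing in the list is falsy
```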

Thanks for the detailed write-up @jonathan-golorry. This would be useful - is there a ticket for it yet? 🙂

It’s been a while, but I’m finally getting around to this.

I think Q.TRUE and Q.FALSE as class property aliases of ~Q(pk__in=[]) and Q(pk__in=[]) make the most sense for now. It would be nice to just use Q.all() and Q.none(), but there would be confusion around Q.none() not being the same as ~Q.any([]).

It’s a bit weird that bool(Q.FALSE) is True, but that’s because only an empty Q() evaluates to False. ALWAYS/NEVER or EVERYTHING/NOTHING might avoid that, but it’s easier for me to think about the logical combinations using TRUE/FALSE.

I could add optimizations for simplifying combining logic, but the ORM handles that when creating the query anyway:

>>> print(get_user_model().objects.filter(Q(pk__in=[]) & Q(pk=7)).query)
...
django.core.exceptions.EmptyResultSet
>>> print(get_user_model().objects.filter(~Q(pk__in=[]) | Q(pk=7)).query)
SELECT "auth_user"."id", ... "auth_user"."date_joined" FROM "auth_user"

Q.any(iterable) and Q.all(iterable) are fairly simple. There’s a weird edge-case with iterables that only contain empty Q() objects. Those iterables will also be considered “empty”. That way any and all NEVER return an empty Q() object. For example, Q.any([Q(), Q()]) will still return Q(pk__in=[]).
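
A rough sketch of the behaviour described above, written as free functions so it runs without Django. Cond is a hypothetical stand-in for Q that is falsy when empty and supports | and &. NEVER plays the role of Q(pk__in=[]). The names q_any and q_all and the default parameter are assumptions for illustration, not the API from the ticket.

```python
import operator
from functools import reduce


class Cond:
    """Hypothetical stand-in for Q: falsy when empty, combinable with | and &."""

    def __init__(self, label=""):
        self.label = label

    def __bool__(self):
        return bool(self.label)

    def __or__(self, other):
        return Cond(f"({self.label} OR {other.label})")

    def __and__(self, other):
        return Cond(f"({self.label} AND {other.label})")


NEVER = Cond("pk IN ()")  # plays the role of Q(pk__in=[])


def _combine(operator_fn, conditions, default):
    # Drop empty conditions first, so [Cond(), Cond()] counts as empty input.
    non_empty = [c for c in conditions if c]
    if not non_empty:
        return default
    return reduce(operator_fn, non_empty)


def q_any(conditions, default=NEVER):
    return _combine(operator.or_, conditions, default)


def q_all(conditions, default=NEVER):
    return _combine(operator.and_, conditions, default)
```

Since Q objects also support |, & and truthiness, the same two helpers should work on them with Q(pk__in=[]) passed as default, though that is untested here.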

I can see someone eventually running into a bug where they’ve checked that their input iterable isn’t length 0 and can’t figure out why they’re getting Q(pk__in=[]), but that’s better than accidentally exposing your entire table.

Edit: Actually, Q(Q()) can still cause problems. It evaluates to True and bypasses the logic checking for empty Q() objects. Simple to fix, but probably belongs in a separate ticket.

Edit2: And my Q(Q()) fix is revealing problems with deconstructing Q objects of query expressions. That might end up being its own ticket as well…

Ticket is up: #32554 (Add Q.empty(), Q.TRUE, Q.FALSE, Q.any(), and Q.all()).

The ticket has been closed as wontfix, so it won’t get added to Django unless there’s an argument for reopening it.