Puzzling side effects from combining QuerySets with | and &

Hello, everyone!

I have faced a weird issue combining query sets.
In my app I construct few simple querysets, then combine them to send my users certain notifications.
Before sending them, I filter them out excluding users which already have received such a notification.

At some point I noticed, that one of my querysets was not filtering properly, while the others did, and users started getting duplicate notifications. Looking intensely into the code gave nothing — all looked fine.

But finally I managed to isolate the issue — and it is incredibly puzzling.

It seems like combining the querysets mutates them somehow, and further use of those queries gives a different result.

Sounds weird, but here is a dive into a code with which I reproduce the issue.

First, here is the code

# Here are three basic QuerySets I use further down
are_active = SiteUser.objects.filter(computer__last_active__gte=day_ago).distinct()
got_money = SiteUser.objects.filter(
    Exists(PayoutRequest.objects.filter(
        requester=OuterRef('pk'),
        status=PayoutRequest.STATUS_SUCCESS
    ))
).distinct()
foreigners = SiteUser.objects.exclude(language='en').exclude(country__in=en_countries).distinct()

#### Here I combine queries in certain ways and send notifications to them
# I have removed all irrelevant code and only left the parts that affect the result
# Note, how this queries are NOT modified and are NOT used in further affected code
A = are_active & (foreigners | got_money)
B = (are_active & foreigners)
#####

# Here is the part that gets affected.
# 1. I combine queries to get those who `need_help`
# 2. I find users that already received this type of Notification
# 3. I exclude those users from the `need_help`
need_help = are_active.exclude(pk__in=got_money).exclude(pk__in=foreigners)
notified = Notification.objects.filter(
    user__in=need_help_with_payouts,
    type=Notification.TYPE_HELP
).values_list('user_id', flat=True)
to_notify = need_help.exclude(id__in=notified).values_list('id', flat=True)

# And here I finally evaluate and check how many users I selected to send Notifications to.
print(len(to_notify))

The problem

I get different len(to_notify) depending if the code between “####” is executed or not.

  • If the code IS NOT present — len(to_notify) == 22, which is correct.
  • If the code IS present — len(to_notify) == 3456, which is incorrect result.
    3456 is the total count of need_help.count().
    to_notify was not filtered properly, because notified returned zero records!

It’s already crazy that combining querysets have affected results down the road.
But even crazier is that only executing both of them leads to invalid results!

  • Executing A and B lines — len(to_notify) == 3456
  • Not executing A and B lines — len(to_notify) == 22
  • Executing only A line — len(to_notify) == 22
  • Executing only B line — len(to_notify) == 22

How could this happen? Combining querysets generate new queryset. Why does combining affect the querysets which are being combined?

Could this be a bug in django ORM? Or am I missing something?
Any help will be appreciated, I would be happy to provide additional details.

My first recommendation for something like this is to print out the SQL representation of the queryset to see what Django is generating.

So if you were to print are_active.query, got_money.query, foreigners.query and then A.query and B.query, you would see how Django is combining these expressions.

Side note: In 10 years of using Django, I don’t think I’ve ever actually seen someone using the & or | operators on complete querysets. I know it’s available, but I’ve only ever used the Q objects for this type of operation.

Hello, Ken! Thank you for your reply.

Please note, that I am not using this combined queries A and B in affected code at all.
So I am not concerned with how are they got combined together.

I AM concerned with how are_active, got_money and foreigners have seemingly changed before and after generating A and B.

Once again:

are_active = SiteUser.objects.filter(...
got_money = SiteUser.objects.filter(...
foreigners = SiteUser.objects.exclude(...

A = are_active & (foreigners | got_money)
B = (are_active & foreigners)

# Code using are_active, foreigners, got_money is further affected. Not code using A or B

I can even run it like this without binding to reference, and it still affects the code:

...
are_active & (foreigners | got_money)
(are_active & foreigners)
... # Here code affected

I tried printing queries like this and found that they does not change:

print(are_active.query)
print(foreigners.query)
print(got_money.query)
A = are_active & (foreigners | got_money)
B = (are_active & foreigners)
print(are_active.query)  # These are exactly the same
print(foreigners.query)
print(got_money.query)

Well, it does seem like a bit niche thing. It really seemed like a proper instrument to apply in my case. At least the code is pretty clear and I am basically combine Sets of the same object in a different manner.

Speaking of sets, I ended up evaluating it in memory, like this:

need_help = set(are_active.exclude(pk__in=got_money).exclude(pk__in=foreigners))
notified = set(Notification.objects.filter(type=Notification.TYPE_YOUTUBE).values_list('user_id', flat=True))
to_notify = need_help.difference(notified)

This always results in a proper set of users to notify.

Let me see if I understand, trying to simplify this even more:

If you have:

You get different results for the lengths of are_active and foreigners than if you run:

are_active = SiteUser.objects.filter(computer__last_active__gte=day_ago).distinct()
foreigners = SiteUser.objects.exclude(language='en').exclude(country__in=en_countries).distinct()
B = (are_active & foreigners)
print(len(B))
print(len(are_active))
print(len(foreigners))

AND you get different results for the lengths of are_active and foreigners than if you run:

are_active = SiteUser.objects.filter(computer__last_active__gte=day_ago).distinct()
foreigners = SiteUser.objects.exclude(language='en').exclude(country__in=en_countries).distinct()
B = (are_active & foreigners)
print(len(are_active))
print(len(foreigners))
print(len(B))

Not quite. Your scenarios are unable to reproduce this problem.

I was trying to make a small django app with reproducible test environment, but so far I was unable to reproduce it in isolation.

So far this is the minimal code with which I am still able to reproduce the issue:

are_active = SiteUser.objects.filter(computer__last_active__gte=day_ago).distinct()
got_money = SiteUser.objects.filter(
    Exists(PayoutRequest.objects.filter(requester=OuterRef('pk')))
).distinct()
foreigners = SiteUser.objects.exclude(language='en').distinct()

####
# A = are_active & (foreigners | got_money)
#####

need_help = are_active.exclude(pk__in=got_money)
notified = Notification.objects.filter(user__in=need_help).values_list('user_id', flat=True)
print(len(notified))
# 302321
# Correct

Uncommenting A, all else is intact

are_active = SiteUser.objects.filter(computer__last_active__gte=day_ago).distinct()
got_money = SiteUser.objects.filter(
    Exists(PayoutRequest.objects.filter(requester=OuterRef('pk')))
).distinct()
foreigners = SiteUser.objects.exclude(language='en').distinct()

####
A = are_active & (foreigners | got_money)
#####

need_help = are_active.exclude(pk__in=got_money)
notified = Notification.objects.filter(user__in=need_help).values_list('user_id', flat=True)
print(len(notified))
# 0
# Incorrect

Ok, I think I narrowed it even further.

are_active = SiteUser.objects.all().distinct()
got_money = SiteUser.objects.filter(
    Exists(PayoutRequest.objects.filter(requester=OuterRef('pk')))
).distinct()
whatever_query = SiteUser.objects.all().distinct()
 
need_help = are_active.exclude(pk__in=got_money)
notified = Notification.objects.filter(user__in=need_help).values_list('user_id', flat=True)
print(len(notified))
 
(whatever_query | got_money)  # Touch with any query
 
need_help = are_active.exclude(pk__in=got_money)
notified = Notification.objects.filter(user__in=need_help).values_list('user_id', flat=True)
print(len(notified))

8066236  # Before touch
0  # After touch

It seem to only happen with got_money query, which implements Exists with OuterRef :thinking:

SQL of notified before “touch”:

SELECT 
  "notifications_notification"."user_id" 
FROM 
  "notifications_notification" 
WHERE 
  "notifications_notification"."user_id" IN (
    SELECT 
      DISTINCT W0."id" 
    FROM 
      "stats_siteuser" W0 
    WHERE 
      NOT (
        W0."id" IN (
          SELECT 
            DISTINCT V0."id" 
          FROM 
            "stats_siteuser" V0 
          WHERE 
            EXISTS(
              SELECT 
                1 AS "a" 
              FROM 
                "billing_payoutrequest" U0 
              WHERE 
                U0."requester_id" = (V0."id") 
              LIMIT 
                1
            )
        )
      )
  )

After “touch”

SELECT 
  "notifications_notification"."user_id" 
FROM 
  "notifications_notification" 
WHERE 
  "notifications_notification"."user_id" IN (
    SELECT 
      DISTINCT U0."id" 
    FROM 
      "stats_siteuser" U0 
    WHERE 
      NOT (
        U0."id" IN (
          SELECT 
            DISTINCT U0."id" 
          FROM 
            "stats_siteuser" U0 
          WHERE 
            EXISTS(
              SELECT 
                1 AS "a" 
              FROM 
                "billing_payoutrequest" U0 
              WHERE 
                U0."requester_id" = (U0."id") 
              LIMIT 
                1
            )
        )
      )
  )

Seems like the first SQL correctly separates subqueries using W0, U0, V0 aliases, while the second one uses a single U0 alias for all subqueries, leading to incorrect results.

Looks like a bug to me.

1 Like

@KenWhitesell Ken, could you please be kind and guide me a bit further on how do I submit such a bug to a django team? Where do I start? There are probably some guides on that. Could you help me out?

Oh, sure! (I didn’t know you were unfamiliar with the process.)

First, create an account for Django. See the first paragraph at Getting involved for your two options.

Then, read Reporting bugs. (You probably don’t need to do the first bullet and you already have done the second.)
You might want to search through the tickets to see if there’s one similar. If you find one that’s close-but-not-quite the same, make a note of it - you’ll want to reference it in your bug report.

When you write up your ticket, I suggest you focus on the minimal demonstration of your last few posts here. You may also want to include the minimal classes needed to be able to run these queries. (e.g., SiteUser, PayoutRequest, and Notification). Only include those fields necessary for the queries to work.

(Side note: Personally, I’d also suggest that you set up a test project that only contains your sample models, and run your example through it to verify what’s happening and that you can recreate it. If you can’t recreate it in a more constrained environment, then the people looking at it aren’t going to be able to do it either. I’ve also discovered in the past that doing this can uncover a previously unseen issue. And no, I’m not talking about you and your case here. I’ve had situations where I’ve spend hours creating a test case to demonstrate a bug, only to have it show me where the error was.)

You may also want to reference this thread in your report. Make sure all the crucial information is included in your report, but add this for additional information.

Finally, be aware that even if you do take all reasonable care in preparing your report, you may still be asked for additional information or a clarification. But don’t worry, the Django Fellows are great people and they’re there to help.

1 Like

Thank you, Ken, you are incredibly informative and helpful.

Almost always when having problems with Django I stumble on one of the threads here on this forums, and it’s alway you providing deep and thoughtful answers in the reply. Thank you for what you are doing for the community :raised_hands:

A confirmed bug report in a tracker with a solution: #35099 (Combining QuerySets with "|" or "&" mutates right-hand side.) – Django

In case someone stumbles on this thread and would like to find a remedy.

1 Like