Combining annotated queryset from different models


Let’s say I have the following queries in a view:

foos = Foo.object.filter(...).values("no_prod").annotate(qte_a=Sum("some_qty"))
bars = Bar.object.filter(...).values("no_prod").annotate(qte_b=Sum("another_qty"))

So essentially, I have 2 different models. They both have one field that I use as an identifier (no_prod). and some other quantities. Some of those objects will overlap (e.g. both foos and bars will have the same no_prod), some may be unique to one or the other. What I’d like to do is something like:

foos_and_bars = foos.union(bars)

But I need to have qte_a, qte_b “added” into the result. My current thinking is to convert them to a list at that point & do the computations in python. Although I know that’s going to be slower with larger sets.

Any ideas?

Just winging something here - but you might be able to do something like:

foos_and_bars = Foo.object.filter(...).annotate(qte_b=Subquery(Bar.object.filter(..., no_prod=OuterRef('no_prod')).annotate(qte_b=...).values('qte_b')))

(If this works, I’d still want to take a really close look at an explain for this query to see how it’s working in practice. This may be one of those cases where doing the work by joining two Python dicts actually is more performant than requiring the database to do this.)

Hmmm, okay. Well in a way I’m glad there isn’t an obvious solution to this. I’ve been testing out a few options, but nothing that was close to working the way I need it. I’ll try and see if those more advanced Query object do the trick.

I was somewhat operating under the assumption that manipulating stuff in python is a bit of a cardinal sin of backend logic in Django. So perhaps I should soften that stance somewhat. The original queries on the system where those came from were sql-ish (but not really sql, so I can’t just use Raw(…), and really slow indeed. So perhaps a python-based approach wouldn’t be that bad. Food for thoughts.

Yes, your requirements here really are not a good fit for the models as you’ve described them. There’s a pretty good chance that if you’re encountering situations like this in multiple places in your code, you need to re-evaluate your table design.

1 Like

So, in my case, turns out the best (only?) way to do this was to make separate queries for Foo, Bar, use .values() for the fields that I want to keep in the final results from each and then .annotate(key=Sum("field")) for any computed values. Then in python, call list() on them so that I perform in this case 2 queries to the database. Then on the lists, I used list comprehension/iterations to merge/further manipulate the results to obtain what I couldn’t easily get just with the ORM.

A relevant note here - much better to call foos_list = list(foos), bars_list = list(bars) and then something like answer = [<some_computation_with_x> for x in foos_list], than to perform the list comprehension/iteration stuff on the results of Foo.object.filter(....) directly. The latter could result in multiple queries to Foo, Bar model and make and already not-so-fast query much, much slower.


@KenWhitesell’s comment about reviewing the model is spot on - the structure of the source model isn’t the best. In my case not so applicable as I’m rewritting code from another SQL-ish system using an existing data model that needs migrating.