Custom inner join using values_list ?


The gist: for performance reasons, I’m trying to have the ORM generate SQL a particular way, and I’m stuck at how to make an inner join on 2 values provided by a values_list.

Abbreviated models with only relevant fields and shorter names:

class GenericContent(models.Model):
  consumer_type = models.ForeignKey(ContentType, on_delete=models.CASCADE, limit_choices_to=allowed_consumers)
  consumer_id = models.PositiveIntegerField(db_index=True)
  consumer = GenericForeignKey('consumer_type', 'consumer_id')

class SpecificContent(GenericContent):
  picture = models.Charfield(max_length=200)
  video = models.Charfield(max_length=200)
  sound = models.Charfield(max_length=200)

class R(models.Model):

class W(models.Model):

class C(models.Model):

class O(models.Model):

class G(models.Model):

R, W, C, O and G are consumers of GenericContent. There are other types of content, but only SpecificContent is used here. My production database have about 10 million rows of Content. It’s a Standard-2 on Postgres.
I’m trying to get all Contents that match a list of consumers (there is a filter per type) by using a subquery that outputs all consumer_type_id and consumer_id couples, then do an inner join of that.

Here’s the SQL I want to generate, that I’ve tested meet my performance goal (there are other ways to write it, but other ways I’ve found are too slow):

SELECT specificcontent.picture,, specificcontent.sound
FROM specificcontent WHERE specificcontent.genericcontent_ptr_id IN
  SELECT FROM genericcontent
    SELECT R."id" as "consumer_id", 28 as "consumer_type_id" FROM R WHERE [some business-specific filters here]
    SELECT W."id" as "consumer_id", 92 as "consumer_type_id" FROM W WHERE [some business-specific filters here]
    SELECT C."id" as "consumer_id", 17 as "consumer_type_id" FROM C WHERE [some business-specific filters here]
    SELECT O."id" as "consumer_id", 11 as "consumer_type_id" FROM O WHERE [some business-specific filters here]
    SELECT G."id" as "consumer_id", 90 as "consumer_type_id" FROM G WHERE [some business-specific filters here]
  ) AS consumers
  ON genericcontent.consumer_id = consumers.consumer_id AND genericcontent.consumer_type_id = consumers.consumer_type_id

The building of the consumers subquery is crucial to performance. Using a mixture of (genericcontent.consumer_id = R.consumer_id AND genericcontent.consumer_type_id = ContentType.objects.get_for_model(R).id) OR [same for other types is not performant enough (DB query ends up taking more than 30s, which is Heroku timeout limit).

So far, I’ve managed to build the consumers subquery using annotate and values_list:

consumers = R.objects.filter([business logic goes here]).annotate(consumer_id=F('id'), consumer_type_id=Value(ContentType.objects.get_for_model(R).id)).values_list('consumer_id', 'consumer_type_id').union(
  W.objects.filter([business logic goes here]).annotate(consumer_id=F('id'), consumer_type_id=Value(ContentType.objects.get_for_model(R).id)).values_list('consumer_id', 'consumer_type_id')
).union([same for other models])

Now what I can’t figure out: how can I use the consumers as a subquery to do the INNER JOIN with Content? Is it even possible without using raw SQL?
It would be very beneficial to me to use the ORM, as I have multiple similar queries that would benefit from that speed-up.

Note: in a previous version (with performance issues) I was using this code to filter:

filters = {
  ContentType.objects.get_for_model(R).id: Q(genericcontent_ptr__consumer_id__in=R.objects.filter([business logic goes here]))
  ContentType.objects.get_for_model(W).id: Q(genericcontent_ptr__consumer_id__in=W.objects.filter([business logic goes here]))
        condition=Q(reduce(operator.or_, (Q(queryfilter & Q(genericcontent__consumer_type=content_type_id)) for content_type_id, queryfilter in filters.items()))))

But I’m at a loss on how to adapt that with the consumers subquery.