How to overcome N+1 queries on one-to-many relation

I’m trying to use Django ORM to access data of the DB I do not control on RO mode.

Tables look this way:

RunAttribute -> Run <- TestRun - TestMetrics

-> - many to one
- one to one

In RunAttribute I’m interested only in a single attribute, so it is one-to-many relationship that for my case is in fact one-to-zero-or-one. But the problem holds true even if I would need to get multiple attributes - to treat them as extra fields of Run model.

First I tried to use subqueries:

    my_attr = RunAttribute.objects.filter(
        run=OuterRef('run'),
        name='my_attr',
    )[:1]

    qs = (
        TestRun.objects.select_related('run', 'test_metrics')
        .annotate(my_attr=labels_subquery.values('value'))
        .filter(
            my_attr__isnull=False,
            # some other run__attr= filters
        )
    )

The problem is that it results in SELECT field1, ..., fieldN, SELECT (...) FROM ... - and it resembles the problem from this thread - how to avoid N+1 queries then?

I also tried to override get_queryset() of Run model to base it on RunAttribute model instead, but it looks like it doesn’t work - it is not used by TestRun.objects.select_related('run') and, moreover,
filtering by run__my_addr__isnull=False doesn’t work:

class RunManager(models.Manager):
    def get_queryset(self):
        return (
            RunAttribute.objects
                .filter(name='my_attr')
                .annotate(my_attr=F('name'))
                .select_related('run')
        )

The best approach here is to either join Run + RunAttribute or append RunArtributes you are interested in to Run as extra fields, but what is the best way to make it using Django as much as possible and have minimal Raw SQL (if at all). And, if to resort to RawSQL, then reuse SQL generated by Django as much as possible.

Without seeing the models involved here, I’m not sure I understand exactly what you’re trying to achieve.

It appears that RunAttribute may be an EAV-type table with (at least) two fields, a name and a value.

Run is effectively the through table of a ManyToMany relationship between RunAttribute and TestRun.

It appears to me that what you’re trying to is:

Get a list of TestRun objects

  • where it has a related RunAttribute object with the name ‘my_attr’
  • access the ‘value’

Assuming that you have the ManyToMany relationship between TestRun and RunAttribute defined as a field, I’ll call it run_attributes, you should be able to do something like:
TestRun.objects.filter(run_attributes__name='my_attr') .annotate(my_attr=F('run_attributes__value'))

Addendum:

This is a completely different issue - if you’re looking to retrieve multiple values, this breaks the effective one-to-one row correlation between TestRun and RunAttributes. In this case, you would be wanting to use the prefetch_related function to associate the multiple rows of RunAttributes with each row of TestRun.