I’m trying to use Django ORM to access data of the DB I do not control on RO mode.
Tables look this way:
RunAttribute -> Run <- TestRun - TestMetrics
-> - many to one
- one to one
In RunAttribute I’m interested only in a single attribute, so it is one-to-many relationship that for my case is in fact one-to-zero-or-one. But the problem holds true even if I would need to get multiple attributes - to treat them as extra fields of Run model.
First I tried to use subqueries:
my_attr = RunAttribute.objects.filter(
run=OuterRef('run'),
name='my_attr',
)[:1]
qs = (
TestRun.objects.select_related('run', 'test_metrics')
.annotate(my_attr=labels_subquery.values('value'))
.filter(
my_attr__isnull=False,
# some other run__attr= filters
)
)
The problem is that it results in SELECT field1, ..., fieldN, SELECT (...) FROM ... - and it resembles the problem from this thread - how to avoid N+1 queries then?
I also tried to override get_queryset() of Run model to base it on RunAttribute model instead, but it looks like it doesn’t work - it is not used by TestRun.objects.select_related('run') and, moreover,
filtering by run__my_addr__isnull=False doesn’t work:
class RunManager(models.Manager):
def get_queryset(self):
return (
RunAttribute.objects
.filter(name='my_attr')
.annotate(my_attr=F('name'))
.select_related('run')
)
The best approach here is to either join Run + RunAttribute or append RunArtributes you are interested in to Run as extra fields, but what is the best way to make it using Django as much as possible and have minimal Raw SQL (if at all). And, if to resort to RawSQL, then reuse SQL generated by Django as much as possible.