Hi
I am having a moderate performance issue due to the way prefetch_related works. It could be fixed using some domain constraints that the query planner is (rightfully) unaware of, but I don’t know how to implement this in Django.
A basic pseudo-representation of my models:
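
    from django.db import models

    class Device(models.Model):
        pass

    class Interface(models.Model):
        # on_delete, null and related_name values are placeholders for this sketch
        device = models.ForeignKey(Device, on_delete=models.CASCADE, related_name="interfaces")
        parent = models.ForeignKey("self", null=True, on_delete=models.SET_NULL, related_name="+")
        aggregation = models.ForeignKey("self", null=True, on_delete=models.SET_NULL, related_name="+")
        connected_to = models.ForeignKey("self", null=True, on_delete=models.SET_NULL, related_name="+")
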
If I run something like this:

    # ... there are more relationships and levels in this format
    Device.objects.all().prefetch_related(
        "interfaces",
        "interfaces__parent",
        "interfaces__aggregation",
        "interfaces__parent__aggregation",
    )

I end up with many INNER JOINs on the Interface table, which takes a “very significant” amount of time.
However, there’s a constraint I could apply: every relationship from an Interface points to one of its own Device’s interfaces. Concretely, something like this holds:

    i = Interface.objects.get(..)
    assert i.device == i.parent.device
    assert i.device == i.aggregation.device

My question is: is there a way I can tell this to the prefetcher?
I’ve been looking at the internals of QuerySet, and effectively, if I could say for every object at each level of the queryset:

    obj._prefetched_objects_cache["parent"] = device.interfaces
    obj._prefetched_objects_cache["aggregation"] = device.interfaces
    ...

that would be perfect (thus only joining once, with interfaces).
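
To make the idea concrete, here is a rough sketch of the manual version, done in plain Python rather than through the prefetcher. The helper name link_within_device and its fk_names parameter are made up for illustration, and the stand-in objects only mimic the id / <name>_id attributes that Django model instances have. It assumes the same-device constraint above really holds.

```python
from types import SimpleNamespace


def link_within_device(interfaces, fk_names=("parent", "aggregation")):
    """Resolve each <name>_id against the same device's interfaces (no queries)."""
    by_id = {iface.id: iface for iface in interfaces}
    for iface in interfaces:
        for name in fk_names:
            target_id = getattr(iface, f"{name}_id")
            if target_id is not None:
                # A KeyError here means the same-device constraint was violated.
                setattr(iface, name, by_id[target_id])


# Stand-in objects (hypothetical data) instead of real Interface rows.
a = SimpleNamespace(id=1, parent_id=None, aggregation_id=None)
b = SimpleNamespace(id=2, parent_id=1, aggregation_id=1)
link_within_device([a, b])

# With real models, the intended usage would be roughly:
#   for device in Device.objects.prefetch_related("interfaces"):
#       link_within_device(list(device.interfaces.all()))
```

On real model instances, assigning to a foreign key attribute stores the related object on the instance, so later accesses to iface.parent should not hit the database. What I’m really after is a way to express this through prefetch_related itself instead of a post-processing loop.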