Optimizing prefetch_related for "repeated" values

Hi
I am having a moderate performance issue due to the way prefetch_related works, which could be fixed given some domain constraints the query planner is (rightfully) unaware of, however I don’t know how to implement this in Django.

A basic pseudo-representation of my models:

class Device(Model):
    pass

class Interface(Model):
    device = models.ForeignKey('Device')
    parent = models.ForeignKey('Interface')
    aggregation = models.ForeignKey('Interface')
    connected_to = models.ForeignKey('Interface')

If I run something like this

Device.objects.all().prefetch_related("interface", "interface__parent", "interface__aggregation", "interface__parent__aggregation") # ... there are more relationships and levels in this format

I end up with many INNER JOINs on the Interface table which ends up taking a “very significant” amount of time.

However, there’s a constraint I could apply: All relationships from Interface map out to one of the items in its Device’s interfaces, concretely, something like this holds:

i = Interface.objects.get(..)
assert i.device == i.parent.device
assert i.device == i.aggregation.device

My question is: is there a way I can tell this to the prefetcher?
I’ve been looking at the internals of QuerySet and effectively if I could say for every object in each level of the queryset:

obj._prefetched_objects_cache["parent"] = device.interfaces
obj._prefetched_objects_cache["aggregation"] = device.interfaces
...

would be perfect (thus only joining once, with interfaces)

You might be able to get some benefit from using the Prefetch object. See the examples in the prefetch_related docs for some ideas.

Hey,
the Prefetch object does work, and almost as I’d like it to (or I’m using it wrong)
I ended up writing this:

ifaces = Interface.objects.filter(device__in=queryset).select_related('connected_to')
ifaces = ifaces.prefetch_related(
              Prefetch("parent", queryset=ifaces),
              Prefetch("aggregation", queryset=ifaces),
              Prefetch("bundle_members", queryset=ifaces),
              ...
          )

However what this does is run a query for each prefetch_related line, selecting every interface by id:

SELECT * -- for brevity
  FROM "networks_interface"
  LEFT OUTER JOIN "networks_interface" T4
    ON ("networks_interface"."connected_to_id" = T4."id")
 WHERE ... "networks_interface"."aggregation_id" IN (literally every single id in "ifaces")

what I’d love to be able to do, is have the internal look-ups avoid an extra fetch and instead look up the ids on the same dictionary.

Something like this:

for p in page:                                                                                   
   p._prefetched_objects_cache = {}
   p._prefetched_objects_cache["parent"] = ifaces           
   p._prefetched_objects_cache["aggregation"] = ifaces          
   p._prefetched_objects_cache["bundle_members"] = ifaces         
   p._prefetched_objects_cache["subinterfaces"] = ifaces

Or, when stepping through the queryset prefetching code, I’d also override the reverse many to one manager