Append extra objects to a reverse ForeignKey queryset

Hi all,

I’m porting an app across from Ruby on Rails, so I’m trying to work around the existing model/database structure that I can’t alter, and have come across an interesting conundrum.

Consider the following skeleton model structure:

class Requisition(models.Model):
    ...

class Item(models.Model):
    requisition_id = models.ForeignKey(Requisition, null=False)
    ...

class Event(models.Model):
    parent_item = models.ForeignKey(Item, null=True)
    parent_req = models.ForeignKey(Requisition, null=True)
    ...

    def clean(self):
        # Guarantee exactly one of parent_item and parent_req are populated

Therefore, an Event can be attached to either an Item or a Requisition, but not both simultaneously, and not neither. Items always have a parent Requisition.

For any given Item, the relevant Events are those which are attached to both the Item, and its parent Requisition (via the requisition_id foreign key). In standard query terms, this can be accessed a couple of ways, e.g.:

i = Item.objects.select_an_item_somehow()

# Filter Event object queryset
Event.objects.filter(Q(parent_item=i) | Q(parent_req=i.requisition_id ))

# Do a union of related manager querysets
i.event_set.union(i.requisition_id.event_set)  # Recall no duplicates due to Event.clean()

However, this leads to a large number of database queries, especially if I need to do this for every Item I get back in some larger collection of those objects.

I need to do a fair bit of analysis/arithmetic using the Event objects relevant to any given Item (and often, for a large number of Items all at once), so accessing this set easily is important.

What I’d really like to do is give the Item queryset the ability to give me the relevant Event queryset back as an integral part of the Item queryset (i.e., have the relevant queryset of Events available as, say, Item.relevant_event). This strikes me as the most efficient way to handle this problem, in a vainglorious attempt to avoid hammering the database with multiple queries.

I’ve thought of a couple of different ways to attempt this, but haven’t had any luck so far:

  1. Use annotate. Within the ItemQuerySet, I’ve tried to add an additional helper function that adds the relevant Event queryset into the Item queryset via annotate. However, I’ve not managed to get the right arrangement of Subquery, Q filtering, etc, to make this work. I’ve also realized I have no clue what field type this should return, so I’m not at all sure this is technically possible. My current attempt (which spits out a single number, so is obviously giving me nonsensical garbage) is something like:
class ItemQuerySet(models.QuerySet):
    def compute_relevant_events(self):
        # Attempts at prefetch, etc. optimization omitted
        return self.annotate(
            relevant_event=Subquery(
                    Event.objects.filter(
                        Q(parent_item=OuterRef("pk"))
                        | Q(parent_req=OuterRef("requisition_id__pk"))
                    ).values("pk")
                )
        )
  1. Alter the reverse relationship manager. After spending far too long attempting the above, I realized it may be quicker to alter whatever manages the reverse lookup of Item.event_set to either:
    • Automatically add the Event objects from the parent Requisition.event_set into the return of Item.event_set; or,
    • Define a new reverse relationship call, say Item.all_event_set, that provides the same return.

To expand this idea, I presume that the SQL that is behind Item.event_set just does:

SELECT * FROM event WHERE event.parent_item=Item.pk

I’m wondering how I can alter that SQL (via Django magic) so that it becomes:

SELECT * FROM event WHERE event.parent_item=Item.pk OR event.parent_req=Item.requisition_id

Having pored through the documentation, I still can’t find if this is possible, or how to go about it. The documentation talks about using a custom reverse manager, but unless I’ve missed something obvious, it doesn’t say if you can apply this automatically to a model (as opposed to invoking it for specific queries), nor what these custom reverse managers look like in practice.

If you’ve made it this far, thanks for reading my tome. Has anyone encountered this sort of problem before, and if so, did you manage to solve it?