Help crafting Django ORMese

Hi, I’m looking for help with some ORMese for something that would be fairly straightforward in plain SQL. Given the following models…

class Event(models.Model):
    created_on = models.DateTimeField(auto_created=True)
    name = models.TextField()
​
class EventData(models.Model):
    event = models.ForeignKey(Event, on_delete=models.CASCADE)
    start = models.DateTimeField()
    revision = models.IntegerField()

…how can I write a Django ORM query that gives me a queryset of EventDatas in which, for each event_id, only the row with the highest revision is selected?

So say I have a DB that looks like:

Events

id | created_on  | name
---+-------------+-----------------
1  | 17 Jan 2020 | Collider powering up
2  | 20 Jan 2020 | Fermion shortage


EventData

id | event_id | start       | revision
--------------+-------------+---------------------
1  | 1        |  1 Feb 2020 | 0
2  | 1        |  2 Feb 2020 | 1
3  | 2        | 25 Jan 2020 | 0
4  | 2        | 26 Jan 2020 | 1
5  | 1        |  3 Mar 2020 | 2

the query should result in selecting the EventData with IDs 4 and 5.

Any help gratefully appreciated :slight_smile: I feel like this should be pretty simple but I can’t figure it out myself.

Not sure if this is the most efficient, but something like:

EventData.objects.order_by('event', '-revision').distinct('event')

I’m pretty sure if you pass a field name to distinct(), it will only evaluate that field for uniqueness. You’ll end up with the first EventData for each event with the highest revision (because of the ordering).

You could also use a Subquery to annotate EventData with the Max(‘revision’) for each event_id and then filter for the ones where revision equals that value.

1 Like