Aggregating a windowed queryset using the ORM

The raw queries documentation says:

Before using raw SQL, explore the ORM. Ask on one of the support channels to see if the ORM supports your use case.

Now, I’m tempted to use raw sql, because I have not been able to figure out how to reproduce the following query (or something similar) using the Django ORM:

SELECT id, student_id, AVG(normalized_value) AS mean_normalized_score
FROM (
    SELECT
        s.*,
        ((s.value - MIN(s.value) OVER (PARTITION BY s.question_id)) / (MAX(s.value) OVER (PARTITION BY s.question_id) - MIN(s.value) OVER (PARTITION BY s.question_id)))
        AS normalized_value
    FROM myapp_score AS s
    ) 
GROUP BY student_id

Could someone tell me if this is possible using the ORM?
And, if so, how?

Here’s a simplified models.py:

class Question(models.Model):
    pass

class Student(models.Model):
    pass

class Score(models.Model):
    student = models.ForeignKey(
        to=Student, on_delete=models.CASCADE)
    question = models.ForeignKey(
        to=Question, on_delete=models.CASCADE)
    value = models.FloatField()

My closest attempt is something like this, but this does not yield the desired result (the grouping for the Avg is not right):

w_min = Window(expression=Min('value'), partition_by=[F('question')])
w_max = Window(expression=Max('value'), partition_by=[F('question')])
annotated_scores = Score.objects.annotate(
    normalized_value=(F('value') - w_min) / (w_max - w_min))
subquery = Subquery(annotated_scores.values('normalized_value'))
Score.objects.values('student_id').annotate(
    mean_normalized_score=Avg(subquery))

I posted a more elaborate question on stackoverflow, but that has not received any answers yet.

I think what it boils down to is: How to put a subquery inside a FROM clause using the Django ORM?

I don’t think this is possible outside of using RawSQL() - see #28333 (Filter and subquery for window expressions) – Django

1 Like

@speno: Thanks for pointing that out. :slight_smile:
I’ll keep an eye on that issue.

While reading Using PostgreSQL Views in Django today, I though you might also be able to solve these kinds of DB queries by using a view table, like the ones described in the link.

I do note that you can do these kinds of queries in the SQLalchemy ORM. When I was looking into this I found this example someplace (and the syntax may not be right here as I’m not a user of SQLalchemy.)

sub_query = session.query(
	Stat, func.rank().over(
		partition_by=Stat.report_id, order_by=Stat.end.desc()
	).label('stat_rank')
).subquery('dept_rank')

session.query(sub_query).filter(
	sub_query.c.stat_rank <= 12
).all()

Potentially interesting for this thread: GSoC 2022 Proposal Feedback - Window expressions improvements