Django queryset filter with Rank

devspectre · March 31, 2021, 12:27am

Hi!

I need to write a queryset which filters 10 higher values and 10 lower values than a given value.

class Toy(models.Model):
    name = models.CharField()
    price = models.IntegerField()

With given price 56, I want to get a list of 10 toys whose prices are higher than the given value and most close to the given value and the same for the lower values.
Finally, I want to include the toy with given price as well.

The problem is I need to get the ranks of those selected items as well.
So I used an annotation like this.

toys = Toys.objects.filter(on_sale=True, store_id__in=sid). \
        annotate(rank=Window(expression=Rank(), order_by=F('price').desc()))
high = toys.filter(price__gte=score).order_by('score', 'created')[:count]
lower = toys.filter(price__lt=score).order_by('-score', 'created')

...

selected_toys = [
        {
            'rank': toy.rank,
            'price': toy.price,
            'store': toy.store_id,
        } for toy in high
    ]

The rank is calculated only for filtered items but I need the ranks for all the toys in that given store id

How can I get the correct ranks after filtering higher values and lower values?

Thanks in advance.

Jefwillems · March 31, 2021, 11:53am

Where does “scores” come from?

If the scores variable is also a QuerySet of Toys, can’t you merge the querysets together and only then add the annotation?

devspectre · March 31, 2021, 11:55am

Sorry, they are toys not scores
My typo

Jefwillems · March 31, 2021, 12:00pm

So, is the rank anything else than the position in the ordered queryset?

devspectre · March 31, 2021, 12:02pm

I can say it’s basically order number based on the price value

Jefwillems · March 31, 2021, 12:08pm

Then why don’t you just use the position in the list to determine rank?

devspectre · March 31, 2021, 12:41pm

You mean the row number? That’s not an exact rank.
Rank is not the same as order.

Jefwillems · March 31, 2021, 1:39pm

But you just sais it’s the order number based on price, so what prevents you from ordering by price and looking at the row number?

KenWhitesell · March 31, 2021, 2:05pm

Just throwing something off the top of my head, but can you filter by store, then annotate by rank and then filter by “on_sale”? (See Order of annotate and filter clauses)
Does that give you the logic that you’re looking for? Or am I misunderstanding your desired results?

devspectre · March 31, 2021, 7:32pm

It’s numerical rank, not order. Same rank can be assigned to multiple objects if they have same price but order is unique even though they have same prices.

awtimmering · April 1, 2021, 5:04am

So just to check your intent, your desired output is

20 toys (i.e. 10 higher and 10 lower), or
20 toys per store_id?

And the rank you want to include should be:

ranking based on price across a store, on_sale only (as in your example code) ?
or, across all stores?
or, within same store, but also across toys not on_sale?

awtimmering · April 1, 2021, 8:23am

This might not be exactly what you need, but hopefully it gives you sufficient ideas to adjust to your requirements.

Interpreting it as:

You want to get 20 rows per store (i.e. 10 above a given value, and 10 below a given value, for each store),
together with a price rank for all toys in that store

You could potentially do something like the below.
It’s not pretty, and depending on your need (eg. do you need to store results of this back into the database or not) it might actually be easier manipulating in pandas or using Raw SQL (because typically you’d write this with subqueries in SQL), but it can be done in Django ORM as well.

from django.db.models import ExpressionWrapper, F, Func, Q, Window
from django.db.models.functions import Abs, Rank, RowNumber

given_price = 56

above_threshold_q = Q(price__gte=given_price)  # including threshold price

filter_10_both = (  # criteria for selecting 10 rows on both sides
	(Q(is_above_threshold=True) & Q(row_vs_threshold__lte=10) )
	| Q(is_above_threshold=False) & Q(row_vs_threshold__lte=10) )

toys = Toy.objects.annotate(
	# RANK() by price within each store_id
	rank=Window(
		expression=Rank(), 
		partition_by=[F('store_id')],
		order_by=F('price').desc()),
	# let's add a boolean field to indicate rows above the threshold
	is_above_threshold=ExpressionWrapper(above_threshold_q, output_field=models.BooleanField()),
	# absolute difference with threshold
	price_vs_threshold=Func(F('price') - given_price, function='ABS'),
	# row number below/above threshold, for each store_id
	row_vs_threshold=Window(
		expression=RowNumber(), 
		partition_by=[F('store_id'), F('is_above_threshold')],
		order_by=F('price_vs_threshold').asc()),			
	# add boolean for 10 rows on both sides of threshold
	selected=ExpressionWrapper(filter_10_both, output_field=models.BooleanField()))	

# get a dictionary as we can not filter() after Window() expressions
toys_selected = [toy for toy in toys.values() if toy['selected']]

# let's have a look:
for toy in toys_selected:
	print (f"{toy['name']} | Store ID #{toy['store_id']} | "
	       f"Price {toy['price']} | Rank {toy['rank']}")

devspectre · April 2, 2021, 6:20pm

Hi, @awtimmering

Thank you for your help.
I actually did that as a first try.

toys = Toy.objects.filter(on_sale=True, store_id__in=store_ids). \
        annotate(rank=Window(expression=Rank(), order_by=F('price').desc()))
higher_toys = toys.filter(price__gte=value).order_by('price', 'created')[:count]
lower_toys = toys.filter(price__lt=value).order_by('-price', 'created')

But the problem is if I filter it out with two querysets, the rank is not applied for total toys but only for those filtered toys respectively.
So in both high and low toys, the rank all starts from 1 which is incorrect.

tonyhoangtran92 · November 3, 2022, 8:55am

I got the same problem and looking for a solution.
Can u let me know how to solve your issue?

Topic		Replies	Views
Django ORM queryset, include index in values_list Using Django	2	2276	March 14, 2021
Django queryset filter problem Using Django	3	615	April 9, 2021
Django queryset chain annotate and distinct together Using Django	13	12473	August 4, 2021
How to Rank Students Based on Total Points in Django QuerySet When Filtering Results Using the ORM	3	60	August 19, 2024
Django database function - Window, expression argument Using Django	4	993	March 14, 2021

Django queryset filter with Rank

Related topics