Django queryset filter with Rank

Hi!

I need to write a queryset which filters 10 higher values and 10 lower values than a given value.

class Toy(models.Model):
    name = models.CharField()
    price = models.IntegerField()

With given price 56, I want to get a list of 10 toys whose prices are higher than the given value and most close to the given value and the same for the lower values.
Finally, I want to include the toy with given price as well.

The problem is I need to get the ranks of those selected items as well.
So I used an annotation like this.

toys = Toys.objects.filter(on_sale=True, store_id__in=sid). \
        annotate(rank=Window(expression=Rank(), order_by=F('price').desc()))
high = toys.filter(price__gte=score).order_by('score', 'created')[:count]
lower = toys.filter(price__lt=score).order_by('-score', 'created')

...

selected_toys = [
        {
            'rank': toy.rank,
            'price': toy.price,
            'store': toy.store_id,
        } for toy in high
    ]

The rank is calculated only for filtered items but I need the ranks for all the toys in that given store id

How can I get the correct ranks after filtering higher values and lower values?

Thanks in advance.

Where does “scores” come from?

If the scores variable is also a QuerySet of Toys, can’t you merge the querysets together and only then add the annotation?

Sorry, they are toys not scores
My typo

So, is the rank anything else than the position in the ordered queryset?

I can say it’s basically order number based on the price value

Then why don’t you just use the position in the list to determine rank?

You mean the row number? That’s not an exact rank.
Rank is not the same as order.

But you just sais it’s the order number based on price, so what prevents you from ordering by price and looking at the row number?

Just throwing something off the top of my head, but can you filter by store, then annotate by rank and then filter by “on_sale”? (See Order of annotate and filter clauses)
Does that give you the logic that you’re looking for? Or am I misunderstanding your desired results?

It’s numerical rank, not order. Same rank can be assigned to multiple objects if they have same price but order is unique even though they have same prices.

1 Like

So just to check your intent, your desired output is

  • 20 toys (i.e. 10 higher and 10 lower), or
  • 20 toys per store_id?

And the rank you want to include should be:

  • ranking based on price across a store, on_sale only (as in your example code) ?
  • or, across all stores?
  • or, within same store, but also across toys not on_sale?

This might not be exactly what you need, but hopefully it gives you sufficient ideas to adjust to your requirements.

Interpreting it as:

  • You want to get 20 rows per store (i.e. 10 above a given value, and 10 below a given value, for each store),
  • together with a price rank for all toys in that store

You could potentially do something like the below.
It’s not pretty, and depending on your need (eg. do you need to store results of this back into the database or not) it might actually be easier manipulating in pandas or using Raw SQL (because typically you’d write this with subqueries in SQL), but it can be done in Django ORM as well.

from django.db.models import ExpressionWrapper, F, Func, Q, Window
from django.db.models.functions import Abs, Rank, RowNumber

given_price = 56

above_threshold_q = Q(price__gte=given_price)  # including threshold price

filter_10_both = (  # criteria for selecting 10 rows on both sides
	(Q(is_above_threshold=True) & Q(row_vs_threshold__lte=10) )
	| Q(is_above_threshold=False) & Q(row_vs_threshold__lte=10) )

toys = Toy.objects.annotate(
	# RANK() by price within each store_id
	rank=Window(
		expression=Rank(), 
		partition_by=[F('store_id')],
		order_by=F('price').desc()),
	# let's add a boolean field to indicate rows above the threshold
	is_above_threshold=ExpressionWrapper(above_threshold_q, output_field=models.BooleanField()),
	# absolute difference with threshold
	price_vs_threshold=Func(F('price') - given_price, function='ABS'),
	# row number below/above threshold, for each store_id
	row_vs_threshold=Window(
		expression=RowNumber(), 
		partition_by=[F('store_id'), F('is_above_threshold')],
		order_by=F('price_vs_threshold').asc()),			
	# add boolean for 10 rows on both sides of threshold
	selected=ExpressionWrapper(filter_10_both, output_field=models.BooleanField()))	

# get a dictionary as we can not filter() after Window() expressions
toys_selected = [toy for toy in toys.values() if toy['selected']]

# let's have a look:
for toy in toys_selected:
	print (f"{toy['name']} | Store ID #{toy['store_id']} | "
	       f"Price {toy['price']} | Rank {toy['rank']}")
2 Likes

Hi, @awtimmering

Thank you for your help.
I actually did that as a first try.

toys = Toy.objects.filter(on_sale=True, store_id__in=store_ids). \
        annotate(rank=Window(expression=Rank(), order_by=F('price').desc()))
higher_toys = toys.filter(price__gte=value).order_by('price', 'created')[:count]
lower_toys = toys.filter(price__lt=value).order_by('-price', 'created')

But the problem is if I filter it out with two querysets, the rank is not applied for total toys but only for those filtered toys respectively.
So in both high and low toys, the rank all starts from 1 which is incorrect.