Find objects with mix Distinct and Order_by

Dear Django Peoples,

I had a small problems with my filters :
I am trying to add the “distinct” function to my sentence, but it seems not working when it is mix with the order_by filter.
This is working :
my_objects_list = Peoples_Push.objects.all().filter(mail__length__lt = 100).order_by(‘-the_date’)
my_objects_list = Peoples_Push.objects.all().distinct(‘mail’)

But this is not working :
my_objects_list = Peoples_Push.objects.all().distinct(‘mail’).order_by(‘-the_date’)

And the error is this one :
Exception Value:
SELECT DISTINCT ON expressions must match initial ORDER BY expressions
LINE 1: SELECT COUNT(*) FROM (SELECT DISTINCT ON ("peoples_peoples_…

Do you have an idea to how to do that ?

Thanks for your help

Pierre

What do you think that this error message is telling you?

Also, see the docs for distinct() at QuerySet API reference | Django documentation | Django

I am using PostGreSQL … so it doesn’t work :(((

Thanks for your help Ken

What do you mean? The examples in that section states that they all work on PostgreSQL.

Sorry I am french and my english is not very good.
I found the problem, the item in the DISTINCT and the ORDER_BY mus t be the same
this is not possible :
.order_by(’-the_date’).distinct(‘the_mail’)

They do not need to be the same.

The restriction is that everything in the distinct must also be in the order_by. However, you can have additional fields in the order_by that are not listed in the distinct.

You are correct, what you specifically listed is not valid.

However, .order_by('the_mail', '-the_date').distinct('the_mail') would be.

Yes, I try this sentence.
But I need to display the objects sort first by the_date and not the_mail …

You can use a Subquery for this. You can write your filter / distinct as a subquery, returning the value of the rows to be used. The outer query can then filter on that set and order by the date field.

Thanks for your help, I will try this

In case anyone bumps into this issue and is concerned about optimal solution, I found a way around it with Window Functions. You can read here

This is how to solve this that won’t result into 2 db hits (via subsequent filters). Note that this might only work with postgresql.

from django.db.models import F, Window
from django.db.models.functions import RowNumber

base_queryset = Peoples_Push.objects.all()

# Annotate row number per mail group, ordering by latest the_date first
latest_per_mail_object = base_queryset.annotate(
    row_number=Window(
        expression=RowNumber(),
        partition_by=[F("mail")],
        order_by=F("the_date").desc()
    )
).filter(row_number=1)

my_objects_list = latest_per_mail_object.order_by("-the_date")

This will create window partition that will be based on distinct “mail” grouping. Each group will be ordered by the_date in descending order and then the latest (row_number = 1) is retrieved per group. After that, you can order the resulting quesryset however you want it again then use the result.