Dear Django Peoples,
I had a small problems with my filters :
I am trying to add the “distinct” function to my sentence, but it seems not working when it is mix with the order_by filter.
This is working :
my_objects_list = Peoples_Push.objects.all().filter(mail__length__lt = 100).order_by(‘-the_date’)
my_objects_list = Peoples_Push.objects.all().distinct(‘mail’)
But this is not working :
my_objects_list = Peoples_Push.objects.all().distinct(‘mail’).order_by(‘-the_date’)
And the error is this one :
Exception Value:
SELECT DISTINCT ON expressions must match initial ORDER BY expressions
LINE 1: SELECT COUNT(*) FROM (SELECT DISTINCT ON ("peoples_peoples_…
Do you have an idea to how to do that ?
Thanks for your help
Pierre
What do you think that this error message is telling you?
Also, see the docs for distinct()
at QuerySet API reference | Django documentation | Django
I am using PostGreSQL … so it doesn’t work :(((
Thanks for your help Ken
What do you mean? The examples in that section states that they all work on PostgreSQL.
Sorry I am french and my english is not very good.
I found the problem, the item in the DISTINCT and the ORDER_BY mus t be the same
this is not possible :
.order_by(’-the_date’).distinct(‘the_mail’)
They do not need to be the same.
The restriction is that everything in the distinct must also be in the order_by. However, you can have additional fields in the order_by that are not listed in the distinct.
You are correct, what you specifically listed is not valid.
However, .order_by('the_mail', '-the_date').distinct('the_mail')
would be.
Yes, I try this sentence.
But I need to display the objects sort first by the_date and not the_mail …
You can use a Subquery for this. You can write your filter / distinct as a subquery, returning the value of the rows to be used. The outer query can then filter on that set and order by the date field.
Thanks for your help, I will try this
In case anyone bumps into this issue and is concerned about optimal solution, I found a way around it with Window Functions. You can read here
This is how to solve this that won’t result into 2 db hits (via subsequent filters). Note that this might only work with postgresql.
from django.db.models import F, Window
from django.db.models.functions import RowNumber
base_queryset = Peoples_Push.objects.all()
# Annotate row number per mail group, ordering by latest the_date first
latest_per_mail_object = base_queryset.annotate(
row_number=Window(
expression=RowNumber(),
partition_by=[F("mail")],
order_by=F("the_date").desc()
)
).filter(row_number=1)
my_objects_list = latest_per_mail_object.order_by("-the_date")
This will create window partition that will be based on distinct “mail” grouping. Each group will be ordered by the_date in descending order and then the latest (row_number = 1) is retrieved per group. After that, you can order the resulting quesryset however you want it again then use the result.