how to using `order_by` from relate model combine with `distinct`

kev26 · November 14, 2022, 2:45pm

When I get API with ordering by product__company__name, the API responds with the result as expected, but with 2 records duplicated. I read the Note in this docs https://docs.djangoproject.com/en/3.2/ref/models/querysets/#distinct that
If you order by fields from a related model, those fields will be added to the selected columns and they may make otherwise duplicate rows appear to be distinct. Since the extra columns don’t appear in the returned results (they are only there to support ordering), it sometimes looks like non-distinct results are being returned.
So how can I resolve this problems ? I just want ordering product base on company name.
My code example:

    def get_queryset(self):
            qs = MyModel.objects.filter(
                   <logic_code>
            )
            return qs.distinct().order_by("product__company__name")

KenWhitesell · November 15, 2022, 3:04am

It might be helpful if you posted the models involved.

What database engine are you using?

Are you needing all the fields in MyModel? Or are you just working with some of them in this situation?

Depending on these factors, you may need to perform the “deduplication” of data yourself within your code - the ORM may not have a facility available to limit this directly in a query.

kev26 · November 15, 2022, 3:59am

I’m using PostgreSQL
So I want to order by specific cases and in my case, I want to sorting book by author name.

KenWhitesell · November 15, 2022, 2:04pm

Since you’re using PostgreSQL, you might be able to use the distinct function with specifying the fields that you want to use. For example, if you just want distinct entries of DefaultShelf, you may be able to use .distinct('id'). (I’m not sure if that’s going to work if you’re not including it in the ordering of those rows. If it doesn’t work, then it may be up to you do manage this in code.)

kev26 · November 15, 2022, 3:46pm

I can’t use distinct(‘id’) because SELECT DISTINCT ON expressions must match initial ORDER BY expressions from docs: QuerySet API reference | Django documentation | Django. I also tried with qs.distinct("product__company__name").order_by("product__company__name") and the result still has the duplicate record.

KenWhitesell · November 15, 2022, 4:05pm

I kinda thought that might be the case.

I think you’re going to need to write the code to do this then, I can’t think of any ORM facility to help you do it.

kev26 · November 16, 2022, 7:30am

Thank you. I read Aggregation in the docs, but seem no way to using annotate to order with alphabet.

KenWhitesell · November 16, 2022, 1:29pm

I was thinking of approaching this from the other direction - keep your query as you have it written, then perform the “deduplication” on the resulting queryset.

kev26 · November 17, 2022, 1:28am

Thank you so much. Could you explain more about this? I really don’t know a way to resolve deduplication except to use distinct(). In my case, the respond API still count=2 as me as expected, but with 3 records in the result with 2 records duplicated.

KenWhitesell · November 17, 2022, 1:38am

While the specifics are going to depend upon precisely what you’re doing in the query and in the view, the general idea is that you write code to examine the queryset to determine which elements of the queryset are duplicates of other elements. There are a couple different ways to do this - you can either examine the queryset before you iterate over it for your template, or you write code that tracks which rows you are processing in your template, and skip any that you have already seen.

If you need more information, a web search for “removing duplicates from a list in python” (or some such similar phrasing) should help you locate any number of examples.

kev26 · November 17, 2022, 1:49am

ya, I understand, you mean to convert it to a list and remove duplicates on it! Thanks for patiently helping me !

Topic		Replies	Views
Django ORM `queryset.distinct()` gets duplicate values due to `.ordered_by()` on the column Using the ORM	2	5200	July 10, 2023
Find objects with mix Distinct and Order_by Using the ORM	8	3684	April 5, 2022
At a loss with ordering by one field and making distinct on another Using Django	3	928	August 10, 2020
How to select from select distinct ? Forms & APIs	2	1067	September 25, 2022
Possible to use distinct() with attributes created via extra() in Django ORM? Using Django	2	932	September 6, 2020

how to using `order_by` from relate model combine with `distinct`

Related topics