To make the long story short, we have a QuerySet of transactions of the same product_id
, ordered by the primary key id
.
qs_transactions = Transaction.active_objects.filter(product_id=fixed_value).order_by('id')
...
def function(qs_transactions)
qs_product_id = qs_transactions.values('product_id').distinct()
...
However, when retrieving the distinct product_id
, the query set returns duplications. It seems that the .ordered_by(id)
forced the query set to include column id
, so the distinct query got the same product_id
with different id
’s.
SELECT DISTINCT
`app_transaction`.`product_id`
, `app_transaction`.`id`
FROM `app_transaction`
WHERE
... conditions ...
ORDER BY `app_transaction`.`id` ASC
We tried with SQL, and if remove app_transaction.id
and leave everything else the same, the query returns the distinct product_id
as expected. We don’t know how to do this with ORM, though.
SELECT DISTINCT
`app_transaction`.`product_id`
FROM `app_transaction`
WHERE
... conditions ...
ORDER BY `app_transaction`.`id` ASC
Our Question:
The Python program appended the .ordered_by()
clause to the query set, and then passed it as an argument to a function call. And we are trying to get the distinct value in the function.
We wonder how to get the expected distinct value.
Hello there!
I assume you are using MySQL here due to the backtip quoting of your identifiers?
I’m not sure which version of MySQL you are running but when I try running your adjusted query on MySQL 8 I get this very explicit message
CREATE TABLE app_transaction (
id int NOT NULL AUTO_INCREMENT,
product_id int NOT NULL,
PRIMARY KEY (id)
);
SELECT DISTINCT product_id FROM app_transaction ORDER BY id;
Expression #1 of ORDER BY clause is not in SELECT list, references column 'app_transaction.id' which is not in SELECT list; this is incompatible with DISTINCT
The long story short is that most SQL backends don’t allow ORDER BY
fields that are not selected when using DISTINCT
(good article on the topic) so Django implicitly add ordering columns when this is the case.
This has been a known footgun for a while and has been extensively discussed in the past but we haven’t proceeded with a solution yet.
In an ideal world we’d first start by erroring out explicitly when trying to do such thing and eventually add support for the ORM to do the subquery pushdown automatically
SELECT DISTINCT product_id
FROM (
SELECT product_id, id
FROM app_transaction
ORDER BY id
) subquery
Hope that helps!
1 Like
Hi, @charettes ,
Thank you for your input.
The database is 10.5.21-MariaDB
, and we just tried again, SELECT DISTINCT product_id FROM app_transaction ORDER BY id;
works in this version.
For the others, yes, we agree with you. And thank you again for your help.