Problem with performing a groupby count using an annotated field

akhsiM · October 27, 2021, 4:45am

Context

I have a table/model called Article.

Within this model, I have several fields such as expiry date, visibility, and status. These fields determine if the Article should be displayed or not. Essentially, the desired logic is like so:

display = True
if status == 'Private'; display = False
if visibility == False; display = False
...

For the sake of simplicity, in this post, I’m just going to be using one field: status. The status field is a CharField with choices being 'Private' or 'Public'.

In order to work out the display logic, I use annotation. This is relatively simple:

all_articles = Article.objects.annotate(
    display=Case(When(status='Private', then=Value(False)), default=Value(True), output_field=models.BooleanField())
)

displayed_articles = all_articles.filter(display=True)
notdisplayed_articles = all_articles.filter(display=False)

Goal

With this setup, I would like to use Django ORM to perform a count with group by to determine how many articles are being displayed, and how many are not.

The outcome would like like this in SQL tabular format:

display	count
True	500
False	2000

Problem

This is the natural way to achieve my goal:

queryset = Article.objects.annotate(
    display=Case(
        When(status='Private', then=Value(False)), default=Value(True), output_field=models.BooleanField()
    )
).values('display').annotate(count=Count('id')).order_by()

print(queryset)

Expectation

I’m expecting something like this:

<QuerySet [{'display': True, 'count': 500}, {'display': False, 'count': 2000}]>

Error

However, I’m getting this:

django.db.utils.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Column 'blog_article.status' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. (8120) (SQLExecDirectW)")

Why is this happening and how can I get around this?

Investigation into the SQL Query

I have looked into the generated SQL Query by:

print(queryset.query)

SELECT CASE
        WHEN [blog_article].[status] = Private THEN False
        ELSE True
    END AS [display],
    COUNT_BIG([blog_article].[id]) AS [count]
FROM [blog_article]
GROUP BY CASE
        WHEN [blog_article].[status] = Private THEN False
        ELSE True
    END

The general structure of the query looks OK, I only needed to make some small changes to it. The below query was executed successfully against the database.

SELECT CASE
        WHEN [faq_faq].[status_type] = 'Private' THEN 0
        ELSE 1
    END AS [display],
    COUNT_BIG([faq_faq].[id]) AS [count]
FROM [faq_faq]
GROUP BY CASE
        WHEN [faq_faq].[status_type] = 'Private' THEN 0
        ELSE 1
    END

I have also tested running this query using pyodbc, and it was executed successfully.

CodenameTim · October 27, 2021, 8:08pm

Hi akhsiM, in your last two queries, the tables change. Can you use a single model and table throughout your question (please also re-verify your assertions using that table and model too to be safe)? That will eliminate the possibility of a copy/paste mistake, false-positives/false-negatives when testing things.

KenWhitesell · October 27, 2021, 8:15pm

I’ve verified that the Django ORM query you’ve written works fine on the Sqlite engine. Can you test this against a different database? (One of the ones supplied by Django)

Also, which database engine are you using?

It’s possible that this error is caused by the database engine, in which case you should raise this issue with a group supporting that engine. (From the error message it looks like you’re using a SQLServer-based engine - I know there’s more than one, but I’m not really familiar with the statuses of any of them.) See Using a 3rd-party database backend.

Side note: What you get from print(some_django_orm_statement.query) is not what is sent to the database. You’re looking at an “internal representation” of a query. The actual query is not generated until the queryset is executed. To see what the database engine is generating, you need to look at either the django connection object or a trace on the database.

Topic		Replies	Views
Using annotation to annotate Using Django	12	10256	September 11, 2020
Trying to get the total count of objects with a bool value of true to display on the template Here's what I have so far models.py Getting Started	1	915	April 18, 2022
Issue filtering queryset by comparing jsonfield value and aggregate value Using the ORM	2	430	April 17, 2024
How Query class work? Django Internals	3	581	May 10, 2022
using annotate to return number of users between two dates Using the ORM	9	5145	January 29, 2023