Is this a bug in Django's SQL creation through multiple many-to-many tables

Although I know how this sounds (it reminds me of when I thought I found a bug in perl in 2000), I think I found a bug in Django… but before I go through the effort of filing a bug report, I thought I would ask here to see if perhaps this is not a bug…

I have a rather complex database and an advanced search interface that creates queries through numerous many-to-many tables and it’s been working great for over a year now.

I recently added a feature to count distinct related table record occurrences in the joined results but I ran into a bug of my own in my code where I wasn’t counting those M:M related table records completely. When I applied a fix to supply missing fields to .distinct(), I couldn’t get the test to execute without hitting an InvalidColumnReference error. And though I’m supplying the same expanded dict to .order_by() that I am to .distinct(), the error claims that SELECT DISTINCT ON expressions must match initial ORDER BY expressions

When I print the SQL, the place where it notes a difference has a weird T8 reference where the model name should be, e.g. ..., "DataRepo_peakgroup"."id" ASC, T8."name" ASC, "DataRepo_compoundsynonym"."name" ASC, .... Note the **T8**."name".

The corresponding DISTINCT ON has “DataRepo_compoundsynonym” instead of T8.

Here’s the full SQL created:

SELECT DISTINCT ON ("DataRepo_peakgroup"."name", "DataRepo_peakgroup"."id", "DataRepo_compoundsynonym"."compound_id", "DataRepo_compoundsynonym"."name", "DataRepo_compoundsynonym"."name", "DataRepo_compound"."name", "DataRepo_peakgroup_compounds"."compound_id", "DataRepo_study"."name", "DataRepo_animal_studies"."study_id") "DataRepo_peakgroup"."id", "DataRepo_peakgroup"."name", "DataRepo_peakgroup"."formula", "DataRepo_peakgroup"."msrun_id", "DataRepo_peakgroup"."peak_group_set_id" FROM "DataRepo_peakgroup" INNER JOIN "DataRepo_msrun" ON ("DataRepo_peakgroup"."msrun_id" = "DataRepo_msrun"."id") INNER JOIN "DataRepo_sample" ON ("DataRepo_msrun"."sample_id" = "DataRepo_sample"."id") INNER JOIN "DataRepo_tissue" ON ("DataRepo_sample"."tissue_id" = "DataRepo_tissue"."id") LEFT OUTER JOIN "DataRepo_peakgroup_compounds" ON ("DataRepo_peakgroup"."id" = "DataRepo_peakgroup_compounds"."peakgroup_id") LEFT OUTER JOIN "DataRepo_compound" ON ("DataRepo_peakgroup_compounds"."compound_id" = "DataRepo_compound"."id") LEFT OUTER JOIN "DataRepo_compoundsynonym" ON ("DataRepo_compound"."id" = "DataRepo_compoundsynonym"."compound_id") LEFT OUTER JOIN "DataRepo_compound" T8 ON ("DataRepo_compoundsynonym"."compound_id" = T8."id") INNER JOIN "DataRepo_animal" ON ("DataRepo_sample"."animal_id" = "DataRepo_animal"."id") LEFT OUTER JOIN "DataRepo_animal_studies" ON ("DataRepo_animal"."id" = "DataRepo_animal_studies"."animal_id") LEFT OUTER JOIN "DataRepo_study" ON ("DataRepo_animal_studies"."study_id" = "DataRepo_study"."id") WHERE UPPER("DataRepo_tissue"."name"::text) = UPPER(Brain) ORDER BY "DataRepo_peakgroup"."name" ASC, "DataRepo_peakgroup"."id" ASC, T8."name" ASC, "DataRepo_compoundsynonym"."name" ASC, "DataRepo_compound"."name" ASC, "DataRepo_peakgroup_compounds"."compound_id" ASC, "DataRepo_study"."name" ASC, "DataRepo_animal_studies"."study_id" ASC

And here’s the code that generates it:

q_exp = Q(msrun__sample__tissue__name__iexact="brain")

fmt_distinct_fields = ['name', 'pk', 'compounds__synonyms__compound', 'compounds__synonyms__name', 'compounds__synonyms__pk', 'compounds__name', 'compounds__pk', 'msrun__sample__animal__studies__name', 'msrun__sample__animal__studies__pk']

all_fields = ['name', 'pk', 'compounds__synonyms__compound', 'compounds__synonyms__name', 'compounds__synonyms__pk', 'compounds__name', 'compounds__pk', 'msrun__sample__animal__studies__name', 'msrun__sample__animal__studies__pk', 'msrun__sample__animal__name', 'peak_data__labeled_element', 'compounds__name', 'msrun__sample__name', 'msrun__sample__tissue__name', 'msrun__sample__animal__tracer_compound__name', 'msrun__sample__animal__studies__name', 'msrun__sample__animal__feeding_status', 'msrun__sample__animal__tracer_infusion_rate', 'msrun__sample__animal__tracer_infusion_concentration']

PeakGroup.objects.filter(q_exp).order_by(*fmt_distinct_fields).distinct(*fmt_distinct_fields).values_list(*all_fields)

The above code works using smaller examples. Previously, I was only adding things to fmt_distinct_fields in specific instances based on the display of the joined table in our view, but I realized this was leading to inaccurate counts in the displayed stats, so I basically decided I needed to include all M:M table fields in that list. And it’s when I include them all that I hit the error.

Note, I typed the toy example by hand after doing a debug print of those variables, so please excuse any minor syntax error I may have introduced. Actually, I’ll go into the shell now and see if I can reproduce the error using that toy example.

Toy Confirmed. In fact, I don’t need the .values_list() at the end to make it happen…

In [1]: from DataRepo.models import PeakGroup

In [2]: from django.db.models import Q

In [3]: q_exp = Q(msrun__sample__tissue__name__iexact="brain")

In [4]: fmt_distinct_fields = ['name', 'pk', 'compounds__synonyms__compound', 'compounds__synonyms__name', 'compounds__synonyms__pk', 'compounds__name', 'compounds__pk', 'msrun__sample__animal__studies__name', 'msrun__sample__animal__studies__pk']

In [5]: all_fields = ['name', 'pk', 'compounds__synonyms__compound', 'compounds__synonyms__name', 'compounds__synonyms__pk', 'compounds__name', 'compounds__pk', 'msrun__sample__animal__studies__name', 'msrun__sample__animal__studies__pk', 'msrun__sample__animal__name', 'peak_data__labeled_element', 'compounds__na
   ...: me', 'msrun__sample__name', 'msrun__sample__tissue__name', 'msrun__sample__animal__tracer_compound__name', 'msrun__sample__animal__studies__name', 'msrun__sample__animal__feeding_status', 'msrun__sample__animal__tracer_infusion_rate', 'msrun__sample__animal__tracer_infusion_concentration']

In [6]: PeakGroup.objects.filter(q_exp).order_by(*fmt_distinct_fields).distinct(*fmt_distinct_fields)
Out[6]: ---------------------------------------------------------------------------
InvalidColumnReference                    Traceback (most recent call last)
~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py in _execute(self, sql, params, *ignored_wrapper_args)
     83             else:
---> 84                 return self.cursor.execute(sql, params)
     85 

InvalidColumnReference: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
LINE 1: ..."."id", "DataRepo_compoundsynonym"."compound_id", "DataRepo_...
                                                             ^


The above exception was the direct cause of the following exception:

ProgrammingError                          Traceback (most recent call last)
~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    392                         if cls is not object \
    393                                 and callable(cls.__dict__.get('__repr__')):
--> 394                             return _repr_pprint(obj, self, cycle)
    395 
    396             return _default_pprint(obj, self, cycle)

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    698     """A pprint that just redirects to the normal repr function."""
    699     # Find newlines and replace them with p.break_()
--> 700     output = repr(obj)
    701     lines = output.splitlines()
    702     with p.group():

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/query.py in __repr__(self)
    254 
    255     def __repr__(self):
--> 256         data = list(self[:REPR_OUTPUT_SIZE + 1])
    257         if len(data) > REPR_OUTPUT_SIZE:
    258             data[-1] = "...(remaining elements truncated)..."

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/query.py in __len__(self)
    260 
    261     def __len__(self):
--> 262         self._fetch_all()
    263         return len(self._result_cache)
    264 

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/query.py in _fetch_all(self)
   1322     def _fetch_all(self):
   1323         if self._result_cache is None:
-> 1324             self._result_cache = list(self._iterable_class(self))
   1325         if self._prefetch_related_lookups and not self._prefetch_done:
   1326             self._prefetch_related_objects()

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/query.py in __iter__(self)
     49         # Execute the query. This will also fill compiler.select, klass_info,
     50         # and annotations.
---> 51         results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
     52         select, klass_info, annotation_col_map = (compiler.select, compiler.klass_info,
     53                                                   compiler.annotation_col_map)

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/sql/compiler.py in execute_sql(self, result_type, chunked_fetch, chunk_size)
   1173             cursor = self.connection.cursor()
   1174         try:
-> 1175             cursor.execute(sql, params)
   1176         except Exception:
   1177             # Might fail for server-side cursors (e.g. connection closed)

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py in execute(self, sql, params)
     96     def execute(self, sql, params=None):
     97         with self.debug_sql(sql, params, use_last_executed_query=True):
---> 98             return super().execute(sql, params)
     99 
    100     def executemany(self, sql, param_list):

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py in execute(self, sql, params)
     64 
     65     def execute(self, sql, params=None):
---> 66         return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
     67 
     68     def executemany(self, sql, param_list):

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py in _execute_with_wrappers(self, sql, params, many, executor)
     73         for wrapper in reversed(self.db.execute_wrappers):
     74             executor = functools.partial(wrapper, executor)
---> 75         return executor(sql, params, many, context)
     76 
     77     def _execute(self, sql, params, *ignored_wrapper_args):

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py in _execute(self, sql, params, *ignored_wrapper_args)
     82                 return self.cursor.execute(sql)
     83             else:
---> 84                 return self.cursor.execute(sql, params)
     85 
     86     def _executemany(self, sql, param_list, *ignored_wrapper_args):

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/utils.py in __exit__(self, exc_type, exc_value, traceback)
     88                 if dj_exc_type not in (DataError, IntegrityError):
     89                     self.wrapper.errors_occurred = True
---> 90                 raise dj_exc_value.with_traceback(traceback) from exc_value
     91 
     92     def __call__(self, func):

~/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py in _execute(self, sql, params, *ignored_wrapper_args)
     82                 return self.cursor.execute(sql)
     83             else:
---> 84                 return self.cursor.execute(sql, params)
     85 
     86     def _executemany(self, sql, param_list, *ignored_wrapper_args):

ProgrammingError: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
LINE 1: ..."."id", "DataRepo_compoundsynonym"."compound_id", "DataRepo_...

I’d be interested in examining this further.

Do you have “toy models” that can be used with this query to demonstrate the issue?

What database engine are you using?

What version of Django are you using?

I do not have a set of toy models to play with and it’s a private github repo (for work).

We’re on Django 3.2 using Postgres.

I was going to post an update yesterday because I’ve learned a bit more and was able to implement a work-around. The problem still exists, but our current query needs and model structure is such that I could code around it.

My hunch is that it has to do with composing the SQL when the relationship “path” goes though multiple “many-related” models. I was able to remove an unnecessary related path from the joined query:

PeakGroup->Compound->CompoundSynonym

I removed CompoundSynonym fields from the query (because I didn’t really need them) and once I did that, the mismatch alias creation in the distinct clause versus the order by clause went away.

  • PeakGroup:Compound is a many-to-many relationship
  • Compound:CompoundSynonym is a 1-to-many relationship

One other “many-related” relationship is involved in the query:

  • Animal:Study (M:M)

Perhaps one complicating factor that could be related is that there are 2 tables that link to Compound in the query. Animal links to Compound. This is a “tracer compound” and is a M:1 relationship. The PeakGroup to Compound link is a “measured compound” (the M:M relationship). My work-around query still includes both links to compound.

I also implemented a catch for the issue, incase future codebase edits reintroduce the problem:

        except ProgrammingError as pe:
            if len(
                all_distinct_fields
            ) > 1 and "SELECT DISTINCT ON expressions must match initial ORDER BY expressions" in str(
                pe
            ):
                raise UnsupportedDistinctCombo(
                    all_distinct_fields, self.modeldata[fmt].__class__.__name__
                )
            else:
                raise pe
class UnsupportedDistinctCombo(Exception):
    def __init__(self, fields, fmt_cls_name):
        message = (
            f"Unsupported combination of distinct fields: [{', '.join(fields)}].  This probably arose from "
            "FormatGroup.getQueryStats(), which compiles a list of distinct fields in many-to-many related tables "
            f"using the fields found in the stats defined here: [{fmt_cls_name}.stats] and rows that are split in the "
            f"views defined here: [{fmt_cls_name}.model_instances[*][manytomany][split_rows]].  To fix this error, "
            f"you either need to remove fields from the [{fmt_cls_name}.stats], or make split_rows false that are "
            "problematic.  Look for tables in the stats that include multiple many-related tables in its field path "
            "and remove thos field entries.  The problem stems from a likely distinct SQL generation bug in Django "
            "when dealing with transiting multiple M:M or 1:M tables int he same path."
        )
        super().__init__(message)
        self.fields = fields

So there should be enough here to make a toy model set, but it would be trial and error to try and reproduce the issue.

I was just trying to strip down our code to create a set of toy models and I ran across something that is perhaps significant…

compounds__synonyms__name and compounds__synonyms__pk are the same field:

class CompoundSynonym(Model):
    name = CharField(
        primary_key=True,
        max_length=256,
    )
    compound = ForeignKey(
        Compound,
        related_name="synonyms"
    )
    class Meta:
        verbose_name = "synonym"
        verbose_name_plural = "synonyms"
        ordering = ["compound", "name"]

Could that have confused the aliasing?

I was just looking in my code that gathers the distinct fields to include in .distinct() and .order_by() and I noted this comment:

# Don't assume the ordering fields are populated/unique, so include the primary key.  Duplicate fields
# should be OK (though I haven't tested it).

I initially add the order-by fields and the line below that comment tacks on the primary key field (just in case).

I will try checking first if the order-by field is the primary key and skip it if so. I’ll then disable my work-around and see if that more precisely identifies the problem or rules it out.

No, that’s not it. Exception still occurs. I also tried excluding only fields from the Compound table. So it only seems to happen when I include fields from CompoundSynonym via the M:M → 1:M path: PeakGroup->Compound->CompoundSynonym.

OK. I managed to reproduce the issue with a set of toy models:

toymodels.py:

from django.db.models import Model, CharField, AutoField, ForeignKey, ManyToManyField, CASCADE

class TestPeak(Model):
    id = AutoField(primary_key=True)
    name = CharField(max_length=10)
    compounds = ManyToManyField(
        to="TestCompound",
        related_name="testpeaks",
    )
    class Meta:
        verbose_name = "testpeak"
        verbose_name_plural = "testpeaks"
        ordering = ["name"]

class TestCompound(Model):
    id = AutoField(primary_key=True)
    name = CharField(max_length=10)
    class Meta:
        verbose_name = "testcompound"
        verbose_name_plural = "testcompounds"
        ordering = ["name"]

class TestSynonym(Model):
    name = CharField(max_length=10, primary_key=True)
    compound = ForeignKey(
        TestCompound, related_name="testsynonyms", on_delete=CASCADE
    )
    class Meta:
        verbose_name = "testsynonym"
        verbose_name_plural = "testsynonyms"
        ordering = ["compound", "name"]

test_bug.py:

from DataRepo.tests.tracebase_test_case import TracebaseTestCase
from DataRepo.models.toymodels import TestPeak, TestCompound, TestSynonym
from django.db.models import Q

class DjangoSQLBug(TracebaseTestCase):
    maxDiff = None

    @classmethod
    def setUpTestData(cls):
        TestCompound.objects.create(name="testcpd")
        cpd = TestCompound.objects.get(id__exact=1)
        TestSynonym.objects.create(name="testsyn",compound=cpd)
        TestPeak.objects.create(name="testpk")
        pk = TestPeak.objects.get(id__exact=1)
        pk.compounds.add(cpd)

    def test_mm_om_query(self):
        q_exp = Q(name__iexact="testpk")
        distinct_fields = ['name', 'pk', 'compounds__testsynonyms__compound', 'compounds__testsynonyms__name', 'compounds__testsynonyms__pk', 'compounds__name', 'compounds__pk']
        qs = TestPeak.objects.filter(q_exp).order_by(*distinct_fields).distinct(*distinct_fields)
        self.assertEqual(qs.count(), 1)

python manage.py test output:

Creating test database for alias 'default'...
Creating test database for alias 'validation'...
System check identified no issues (0 silenced).
E
======================================================================
ERROR: test_mm_om_query (DataRepo.tests.sqlbugtest.test_bug.DjangoSQLBug)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.InvalidColumnReference: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
LINE 1: ...peak"."id", "DataRepo_testsynonym"."compound_id", "DataRepo_...
                                                             ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/DataRepo/tests/sqlbugtest/test_bug.py", line 21, in test_mm_om_query
    self.assertEqual(qs.count(), 1)
  File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/query.py", line 412, in count
    return self.query.get_count(using=self.db)
  File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/sql/query.py", line 519, in get_count
    number = obj.get_aggregation(using, ['__count'])['__count']
  File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/sql/query.py", line 504, in get_aggregation
    result = compiler.execute_sql(SINGLE)
  File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1175, in execute_sql
    cursor.execute(sql, params)
  File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/Users/rleach/PROJECT-local/TRACEBASE/tracebase/.venv/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
LINE 1: ...peak"."id", "DataRepo_testsynonym"."compound_id", "DataRepo_...
                                                             ^


----------------------------------------------------------------------
Ran 1 test in 0.018s

FAILED (errors=1)
Destroying test database for alias 'default'...
Destroying test database for alias 'validation'...
gen-rl-macbookair[2022-05-05 12:34:44]:~/PROJECT-local/TRACEBASE/tracebase$ 

@KenWhitesell - Do you think I should just file a bug at this point? Seems confirmed enough to me… What do you think?

You certainly could, it wouldn’t be wrong to do so. (You’d want to include the complete testing code to allow anyone to look at / evaluate this.)

Now that you’ve provided a reproduceable test case, I am curious enough to want to look more closely at this myself - but I’m definitely not an ORM expert.

https://code.djangoproject.com/ticket/33682

Can please share the definition of TracebaseTestCase as in

from DataRepo.tests.tracebase_test_case import TracebaseTestCase

I am not able figure this out.

Absolutely, but I don’t think it is going to be helpful. It’s very simple:

from django.test import TestCase


class TracebaseTestCase(TestCase):
    """
    This wrapper of TestCase makes the necessary/desirable settings for all test classes.
    """

    maxDiff = None
    databases = "__all__"

    class Meta:
        abstract = True

Did you look at the bug ticket I filed (commented above)? They seem to have figured it out.