A specific ManyToMany query crash

Hey folks,

While I was exploring how to perform a specific kind of ManyToMany query, namely is there a way to filter the relationship by exact set? For instance, .filter(m2m=foo) will find all objects with a relation to foo, but what should we do if we want objects where the relation set is limited to only foo or only foo and bar? So I came up with a set of techniques for doing this and enumerated them in this gist.

One of the techniques raises a TypeError and I’d like your opinion on if that might be a bug in Django that should be filed, or something else.

We use the classic example from the Django documentation of Pizzas with Toppings!

class Topping(models.Model):
    name = models.CharField(max_length=30)
    def __str__(self):
        return self.name

class Pizza(models.Model):
    name = models.CharField(max_length=50)
    toppings = models.ManyToManyField(Topping, related_name="pizzas")
    def __str__(self):
        return "%s (%s)" % (
            self.name,
            ", ".join(topping.name for topping in self.toppings.all()),

class Array(Func):
    function = 'ARRAY'
    template = '%(function)s[%(expressions)s]'

subquery = Subquery(
    Topping.objects.filter(
        pizzas=OuterRef('id'),
    ).order_by()
    .values(
        'pizzas'
    ).annotate(
        its_toppings=ArrayAgg('id'),
    )
    .values(
        'its_toppings',
    ),
    output_field=ArrayField(IntegerField),
)
pizzas = Pizza.objects.annotate(its_toppings=subquery)
cheese, _ = Topping.objects.get_or_create(name="cheese")
onions, _ = Topping.objects.get_or_create(name="onions")
pizzas = pizzas.filter(its_toppings=Array(cheese.id, onions.id))

Evaluating that pizzas QuerySet will raise the following exception:

>>> a = list(pizzas.filter(its_toppings=Array(cheese.id, onions.id)))
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/Users/speno/Documents/venvs/m2m/lib/python3.8/site-packages/django/db/models/query.py", line 276, in __iter__
    self._fetch_all()
  File "/Users/speno/Documents/venvs/m2m/lib/python3.8/site-packages/django/db/models/query.py", line 1261, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/Users/speno/Documents/venvs/m2m/lib/python3.8/site-packages/django/db/models/query.py", line 57, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/Users/speno/Documents/venvs/m2m/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1139, in execute_sql
    sql, params = self.as_sql()
  File "/Users/speno/Documents/venvs/m2m/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 506, in as_sql
    where, w_params = self.compile(self.where) if self.where is not None else ("", [])
  File "/Users/speno/Documents/venvs/m2m/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 423, in compile
    sql, params = node.as_sql(self, self.connection)
  File "/Users/speno/Documents/venvs/m2m/lib/python3.8/site-packages/django/db/models/sql/where.py", line 81, in as_sql
    sql, params = compiler.compile(child)
  File "/Users/speno/Documents/venvs/m2m/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 423, in compile
    sql, params = node.as_sql(self, self.connection)
  File "/Users/speno/Documents/venvs/m2m/lib/python3.8/site-packages/django/db/models/lookups.py", line 176, in as_sql
    lhs_sql, params = self.process_lhs(compiler, connection)
  File "/Users/speno/Documents/venvs/m2m/lib/python3.8/site-packages/django/db/models/lookups.py", line 169, in process_lhs
    db_type = self.lhs.output_field.db_type(connection=connection)
  File "/Users/speno/Documents/venvs/m2m/lib/python3.8/site-packages/django/contrib/postgres/fields/array.py", line 83, in db_type
    return '%s[%s]' % (self.base_field.db_type(connection), size)
TypeError: db_type() missing 1 required positional argument: 'connection'

And other lookups have different results, including a working result when using len:

>>> pizzas.filter(its_toppings__contains=1)
TypeError: cast_db_type() missing 1 required positional argument: 'connection'

>>> pizzas.filter(its_toppings__len=2)
<QuerySet [<Pizza: mushroom (cheese, mushrooms)>, <Pizza: onion (cheese, onions)>]>

This error was found in Django versions 2.2.14, 3.0.8, and also in the main branch of the repo (156a2138db20abc89933121e4ff2ee2ce56a173a at the time of testing).

So, is there something worth reporting as a bug there? If not, please help me understand what else is going on.

Thanks!

I am not sure why you’re seeing this crash. I think your custom Array function has managed to map back to the ArrayField in django.contrib.postgres because of its return type, but it seems the ArrayField has been instantiated wrongly. It may be that you need to copy the output_field property from ArrayAgg, to correctly set up what the retrun field is.

Thanks, Adam!

The code did not crash once I removed the output_field from the Subquery.

While debugging this issue, I had tried different values for output_field there but I never tried not using it. I did not understand that it was only needed when Django could not determine on its own what the resulting field type should be. I now see this is documented, albeit in a different section on the https://docs.djangoproject.com/en/3.0/ref/models/expressions/ page. Even though the Subquery docs clearly show it defaults to None, I’m thinking that referring back to that section in the https://docs.djangoproject.com/en/3.0/ref/models/expressions/#aggregate-expressions would have helped me.

There was also some “cargo culting” on my part as I took code I had seen in the django-sql-utils readme as the basis for what I was trying to do without understanding it fully.

Take care.