Annotate with Foreign Key

Hello everybody,

I have a model Category and a model Product. Product is a Foreign Key in Category. I now want to retrieve a list of dictionaries of categories with for every category ideally also a list of dictionaries of the products. I am using values() to retrieve the categories. This is working. The problem is to retrieve the products also in a list of dictionaries. I tried ArrayAgg, but I am not managing to combine those arrays in a dictionary or a nested array. I want to do as much as possible on the database level, because this is the most efficient.

So concrete:

class Category(models.Model):
name = models.CharField(max_length=50) # the name of the category

class Product(models.Model):
name = models.CharField(max_length=70) # the name of the product
category = models.ForeignKey(Category, related_name=‘products’, null=True, on_delete=models.SET_NULL)
image = models.ImageField(upload_to=‘images/products/’) # a link to the picture

What I want:

[{‘id’:543, ‘name’: ‘name_of_category’, ‘products’:[{‘id’:43,‘image’:‘image_of_product’},{…}]},…]

Thanks!

No, actually it’s not. Don’t confuse the Python representation with what’s stored internally in the database. Using the database to perform a representational transformation is not intrinsically “better” than performing the transformation in python.
(Using the values clause doesn’t change the query compared to using the same fields in an only clause. The difference of results only exists within Python.)

What you’re probably looking for is to implement a custom serializer to render your query results in the desired format.

Thanks Ken for the response!

I thought it is more efficient to not make use of the ORM (object instantiation, …) and to immediately use values() to reduce the overhead. The serializer is performed after the object is instantiated and this causes some overhead. What do you think is the most efficient way to fix this?

Before worrying about that, I’d do some actual timing to see whether or not it’s really a problem that needs to be addressed. If it is, then I’d start looking at self-coded solutions. But unless you’re talking about thousands of rows that you’re retrieving at once, I don’t think it’s going to make a real difference. (This is in the general case - there are situations where it might be a problem, but that’s highly context dependent.)

I understand!

Is there a way to write a custom expression that is similar with the PostgresSQL functionality: json_agg(json_build_object(‘id’,p.id,‘name’,p.name))?

Thanks!

To nest the related tables in a serializer, you’d need to install the django-rest-framework (DRF) as the core serializers will not do nested related tables.

Even at that, I’m still more inclined to go with a well-written and tested solution than trying to go the roll-my-own path. (I don’t like reinventing the wheel and I really don’t like writing code that someone else has already written, and that I can use to my benefit.)

I understand your way of thinking, but speed is really important here.

I managed to write the code in PostgresSQL but I want to do it now with the Django ORM. I think it is possible to write a new aggregate function to simulate the behaviour of the json_agg?

This is my code:
SELECT
c.id,
c.name,
json_agg(json_build_object(‘id’,p.id, ‘name’,p.name))
FROM
orders_category c inner join orders_product p on p.category_id = c.id
WHERE
c.venue_id = 16
GROUP BY
c.id, c.name;

If milliseconds matter (I work in an area where sometimes speed is important too), then I would suggest ignoring the ORM completely.

If this query does what you need it to do, and you’re not looking to return Python objects, I’d run it directly. It will bypass a lot of Django overhead not needed in this case.

See Executing custom SQL Directly on the Performing raw SQL queries page.