How does serializer retrieve columns referred by a reverse foreign key?

simpleopen · February 24, 2022, 2:18am

In a scenario of transaction processing, we have:

table order to record incoming transactions, and
table order_log to log updates on a recorded order, with a foreign key to table order.

A recorded order may have zero to multiple updates on the log.

We want to make a flattened view of the two tables like a SQL query selecting from order left outer join order_log, with the following behaviour:

if the order has no update, list the order in joint with null values;
if the order has one update, list the order in joint with the update log;
if the order has multiple updates, list the order multiple times in joint with each update log.

As the sample source code below, we used .prefetch_related('orderlog_set'), and it gets the wanted flattening effect, and the Django backend log shows a left-outer-join SQL query sent to the database as expected. So, the mutual relation works correctly for all the three conditions above.

However, we cannot retrieve columns in order_item, as the table is in a foreign key’s reverse direction.

The serializer FlatOrderLogSerializer bases its meta-model on the table order, so it refers to table order_log by a foreign key’s reverse or backward direction. As a result, the serializer cannot retrieve the correct column and always gets null values.

We correctly retrieved columns in the native table and tables referred by a foreign key’s normal direction. Unfortunately, just the reverse/backward foreign key does not work.

We are new to this part of Django and not sure how to correctly set the source attribute of the field object. We tried serializers.ReadOnlyField(source='orderlog_set__update_ts', ... and a few other options, but not working so far. See the comments in the sample code for more details.

Just let me know if you need more information, and any suggestions will be highly appreciated.

# Table 'order'
class Order(models.Model):
    ...

# Table 'order_log'
class OrderLog(models.Model):
    order = models.ForeignKey('Order')
    update_ts = models.DateTimeField(auto_now=True, editable=False, verbose_name="Last Updated On")
    ...

# The serializer
class FlatOrderLogSerializer(serializers.ModelSerializer):
    # Standing at table 'order', it refers to table 'order_log' by a reverse (backward) foreign-key.
    # The field with source='orderlog_set__update_ts' always gets null, and
    # we tried other source settings, e.g. 'orderlog_set.update_ts', or 'orderlog.update_ts', 
    # unfortunately nothing works so far.
    update_ts = serializers.ReadOnlyField(source='orderlog_set__update_ts', allow_null=True)
    ...
    class Meta:
        model = Order
        fields = (
            'update_ts',
            ...
        )

# The view
class FlatOrderLogView(generics.ListAPIView):
    serializer_class = rest_models.FlatOrderLogSerializer
    ...
    def get_queryset(self):
        flat_orderlogs = Order.active_objects.filter(
            ...
        ).prefetch_related(
            'orderlog_set',
        ).all()
        return flat_orderlogs

simpleopen · March 8, 2022, 1:07am

After some Google searching, it looks like:

The Django framework provides a standard function select_related() for joining following a foreign key’s forward direction. The backend implements this function as an inner-join SQL query.
However, we did not find an elegant way to look backward by an outer-join. Instead, most articles recommend using function prefetch_related() and making a nested serializer. And this design pattern seems to be a common practice.

Below are some tentative thoughts, and we hope to get feedback or corrections from experts:

With SQL query, we can refer to another table in the foreign key’s backward direction by an outer-join. However, it seems that Django does not encourage this intuitive design pattern, therefore, did not implement a standard method. If confirmed, we will give up the outer-join query and look into other solutions instead.

By the way, we may also want to make a little clarification on terminology:

We did not find any textbook definition of the foreign key’s directional attribute, e.g. the forward, backward, or reverse directions; however, these words exist in the online literature.

In the ER diagram below, the app_orderitemlog table contains a foreign key pointing to the app_orderitem table. So, with this foreign key relationship:

The app_orderitemlog table refers to the app_orderitem table in the foreign key’s forward direction.
The app_orderitem table refers to the app_orderitemlog table in the backward direction, also called the reverse direction.

er-diagram

KenWhitesell · March 8, 2022, 3:19am

I’m trying to follow what you’re saying here, but I’m lost.

Is there still a question or issue needing to be resolved?

If the title is still an accurate description, you serialize a related table using one of the related serializers, possibly the nested serializers

simpleopen · March 8, 2022, 7:02pm

Thank you for your pointers, @KenWhitesell.

The scenario is transaction processing. As the ER diagram shows above, we have:

an order table ( app_orderitem ), and
a log table ( app_orderitemlog ) that records the actions taken to the order, e.g. “fulfillment”, “refund”, “cancel”, etc. With our current design, an order has no logging event if the back-office staff has not taken action yet.

Yes, we can use a nested serializer so that each order contains its log events. Correspondingly, the web user interface will display a list of orders, each with a plus sign, and the user can click on the plus sign to expand the log events. However, this solution means changing our requirements.

The transaction history (or audit trail) is usually a flat list of log events with the order details. Moreover, the list also needs to include the orders with no logging event, similar to an outer-join SQL query.

Strictly speaking, the audit trail of an order starts when the transaction request arrives, so there should be a “getting created” event in its log before any manual actions of the back-office staff. Therefore, a better solution is to improve the transaction processing logic so that each order will have at least one beginning log event. In this way, we can use the select_related() function to start from the log table and inner-join with the order table.

It takes significant effort to change an existing transaction processing module, so we want to do some homework and make sure it is necessary. For example, if there is an easy way to do an outer-join SQL query, we do not have to make this change. Does this answer your questions about our intention?

Thank you again for your reply.

KenWhitesell · March 8, 2022, 8:05pm

You are conflating two separate and distinct issues here.

One issue is the retrieval of data from the database. That’s where you use the nested serializer to retrieve the log data.

The second, separate, and distinct issue is the presentation of that data to the user.

Once you have the data needed, you can perform whatever transformations desired on that data to create whatever representation you want to provide to your users.

The issue here then isn’t the queries being generated, but the representational transformations needing to be applied to that data for display.

simpleopen · March 8, 2022, 11:39pm

Yes, @KenWhitesell, you’re right. We focused on a specific question about the serializer, but the system design problem involves other aspects as our conversation gradually unfolds.

We need the pagination to divide a big-sized data set into small-sized pages for fast backend processing. The nested data structure means to paginate on the orders, and yes, we can transform the data page to a flat format for display.

However, the same-sized pages will render uneven numbers of records because some orders contain more log events. For example, an order may have a history of back and forth, including actions of “fulfillment”, “reverse-fulfillment”, then “fulfillment” again.

To avoid the above issue, we need to build a queryset in the flat format before passing it to the paginator. In my opinion, transformation after pagination will lead to issues.

Again, we want to get feedback from exports to prevent low-level mistakes in the reasoning chain leading to the technical decision. Thank you for your reminder.

millerjason0 · January 1, 2024, 4:24am

simpleopen:

The scenario is transaction processing. As the ER diagram shows above, we have:

an order table ( app_orderitem ), and

a log table ( app_orderitemlog ) that records the actions taken to the order, e.g. “fulfillment”, “refund”, “cancel”, etc. With our current design, an order has no logging event if the back-office staff has not taken action yet.

Yes, we can use a nested serializer so that each order contains its log events. Correspondingly, the web user interface will display a list of orders, each with a plus sign, and the user can click on the plus sign to expand the log events. However, this solution means changing our requirements.

The transaction history (or audit trail) is usually a flat list of log events with the order details. Moreover, the list also needs to include the orders with no logging event, similar to an outer-join SQL query.

Strictly speaking, the audit trail of an order starts when the transaction request arrives, so there should be a “getting created” event in its log before any manual actions of the back-office staff. Therefore, a better solution is to improve the transaction processing logic so that each order will have at least one beginning log event. In this way, we can use the select_related() function to start from the log table and inner-join with the order table.

It takes significant effort to change an existing transaction processing module, so we want to do some homework and make sure it is necessary. For example, if there is an easy way to do an outer-join SQL query, we do not have to make this change. Does this answer your questions about our intention?

Yes, your explanation clarifies the need to reconsider your current approach. Instead of altering the transaction processing logic to ensure each order has a beginning log event, you’re exploring alternatives like using SQL outer-joins or other mechanisms to achieve a flat list of log events that also includes orders without any logging events. This evaluation is essential before committing to significant changes in your transaction processing module.

Topic		Replies	Views
Attach models to queryset in api serializer Using the ORM	1	1123	May 18, 2022
Setting both sides of OneToOne field using the related_name on a serializer Using Django	2	1338	November 8, 2021
Use of select_related in nested serializers in rest framework Using Django	3	3687	October 6, 2021
Annotate with Foreign Key Using Django	7	4817	September 12, 2020
How to use prefetch_related to retrieve multiple rows similar to SQL result Forms & APIs	2	647	March 10, 2025

How does serializer retrieve columns referred by a reverse foreign key?

Related topics