Aggregate/annotate based on 2 different models

Hi!

I am stuck and can not figure out how availability in a rental system:
Background:
To learn django, I am trying to scratch an itch and create a rental system. It has 3 models:

  • Products, contains data about a product such as name and price
  • Projects, an event where the products are rented out. It contains a start/end date and some general info
  • Transactions, this is where a store all the “line items”. Each has a foreign key to product and project, and a quantity.

What I want to do, is when I created a new project, show a table of products and show the number of available of each item.
For each product, filter out the transactions that overlap the current project, and calculate the available quantity for each product.

class Product(models.Model):
    name = models.CharField(max_length=200)
    objects = ProductManager()
class Project(models.Model):
    name = models.CharField(max_length=200)
    description = models.CharField(max_length=400, blank=True)
    start_date = models.DateField(editable=True, blank=True, null=True)
    end_date = models.DateField(editable=True, blank=True, null=True)
class Transaction(models.Model):
    project = models.ForeignKey(Project, on_delete=models.CASCADE, related_name='transactions')
    product = models.ForeignKey(Product, on_delete=models.CASCADE, related_name='transactions')
    quantity = models.IntegerField()

    @property
    def available(self):
        bought = self.transactions.filter(type='ADD').aggregate(
            Sum('quantity'))['quantity__sum']
        out = self.transactions.filter(type='REN').aggregate(
            Sum('quantity'))['quantity__sum']
        if bought is None:
            bought = 0
        if out is None:
            out = 0
        return bought-out

I have tried to in the view do things like:
products = Product.objects.event_filter(project.start_date, project.end_date).all()
where the event filter is

    def event_filter(self, start_date, end_date):
        lookup = ((Q(transactions__type='REN') & Q(transactions__project__start_date__lt=end_date) & Q(transactions__project__end_date__gt=start_date)) |
                  (~Q(transactions__type='REN') & (Q(transactions__project__start_date__lte=start_date) | Q(transactions__project__end_date__lt=start_date))) |
                  (Q(transactions__id=None))
                  )
        return self.get_queryset().filter(lookup).distinct()

and in the template use, but the event filter added in the view gets “ignored”.:

    <table class="product-table">
        <tr>
            <th>Product</th>
            <th>Available</th>
        </tr>
        <tr>
            <td>{{product.name}}</td>
            <td>{{product.available}}</td>
        </tr>

        {% endfor %}
    </table>

I can not figure out how to handle that the start/end date moves. Annotation feels like it should be on the right track, but I can not figure out how do it with the dynamic dates instead of the fixed ratings for books in the tutorials and documentation.

I feel like I am missing something. Any tips?

Can you explain what you mean by this? Is the collection of products correctly filtered as you expect and the {{ product.available }} wrong, vice versa or something else entirely?

Negating conditionals in many-to-many relationships is really tricky, because such a clause might be true for one row related to the base object, but have a different row also related to that same object that is false - and that can create some “intermediate results” of the query to not be what you’re intending.

A lot of times I find it easier to build the base query on the join table, where you can reference the related tables directly.

Also, your logic to identify overlapping ranges can be simplified.

If you have two ranges, (start_1, end_1) and (start_2, end_2), then these two ranges overlap only if start_1 <= end_1, start_2 <= end_2 (the end dates must be after the start dates), and end_1 >= start_2 and start_1 <= end_2.

As a first idea then, you might be able to add an annotation to a query to retrieve Product, where the annotation has a subquery to calculate the sum of all Transaction with dates overlapping the Project start and end dates, and subtract that value from the quantity available in Product.

(Note - I’m also assuming you have a quantity field in Product identifying the total number of Product that you have.)

It calculates the availability for all transactions ever made. Ignoring the dates.

Thanks, I have read your post and will have to read a few more times.

I do not store the total owned/available in a quantity field in product, (but have it as a property). Buying and selling, is also stored as transactions. Each transaction has a type and buy/sell only have a start/end date. To be able to handle the case of a planned purchase. (Maybe I should change the sales and broken/lost state to have a negative quantity in the database to make life easier)

You mention “the join table” is it the Transactions model in my case?

I have to study up on subqueries and annotation and do some testing, but I think you are suggesting something like before I have some follow ups.

That is correct.

Your Transaction model could be defined as the through table of a ManyToManyField between Project and Product.

e.g. In your Project model, you could define:
product = models.ManyToManyField(Product, through='Transaction')

This is in addition to your current Transaction model, it does not replace it. It would allow you to use some of the “shorthand” notation available for accessing and defining entries between Project and Product - whether or not it adds any value to what you are trying to do isn’t a question I can answer.

Thanks, I must try that and think about to see if it makes sense.

I got calculating availability working as property, but it might be a bit stupid, and I think it does a lot of queries.

Using this works very well when showing what is booked for a project and how much more is available, but when showing a list of products to add to an empty project, it gets tricker, since at that point there are not transactions yet.

    @property
    def available(self):
        project = self.project
        product = self.product
        try:
            added = self._meta.model.objects.all().filter(product=product).filter(
                type='ADD').event_filter(project.start_date, project.end_date).aggregate(total=Sum('quantity')).get('total')
        except:
            added = 0
        try:
            removed = self._meta.model.objects.all().filter(product=product).filter(type='REM').event_filter(project.start_date, project.end_date) | self._meta.model.objects.all(
            ).filter(type='DMG').event_filter(project.start_date, project.end_date).aggregate(total=Sum('quantity')).get('total')
        except:
            removed = 0
        try:
            rented = self._meta.model.objects.all().filter(product=product).filter(
                type='REN').event_filter(project.start_date, project.end_date).aggregate(total=Sum('quantity')).get('total')
        except:
            rented = 0

        available = added-removed-rented
        return available

Just a side note, without addressing the specifics of the implementation of your available method:

In general, this type of function more properly belongs in a custom model manager and not in the model itself.

A model method should relate to one specific instance of the model. A model manager method is more appropriate where the method refers to a collection of model instances. That type of method can be used directly to retrieve a value, or be written as a subquery to be used within other queries, or as a custom queryset - you have lots of options at that point.

I am still working on getting this to work well.

Have I designed this in a stupid way? Is there a smarter way to design a rental system?

I can’t answer that question. I don’t know enough about your requirements to have an informed opinion.

It’s not like there’s only one good way to design such a system. Creating an optimal design for a system should also take into account how that system is going to be used - not just the data available to be stored. You also need to understand how that data is going to flow through your system and what other processing needs to occur (e.g. periodic reports or dashboards.)

Thanks for all of your help and comments!

I have managed to get the result in a view with a Subquery. However I now realise it will be difficult to get it into a queryset or manager. I have Transactions in another app, and if import it into this model.py I will get errors due to circle references. Is there another way to do this reusable or should I just combine the apps?

    start_date = project.start_date
    end_date = project.end_date
    rental_filter = Q(
        Q(type='REN')
        & Q(project__start_date__lt=end_date)
        & Q(project__end_date__gt=start_date)
    )
    add_remove_filter = Q(
        Q(
            ~ Q(type='REN')
            # everything that adds products
            & Q(project__start_date__lte=start_date)
        )
        | Q(
            ~ Q(type='REN')
            # everything that removes products
            & Q(project__end_date__lt=end_date)
        )
    )

    rentals_subquery = (Transaction.objects
                        .values('product__id')  # group by product
                        .filter(rental_filter)
                        .filter(product__id=OuterRef('id'))
                        .values_list(Sum('quantity')))
    owned_subquery = (Transaction.objects
                      .values('product__id')  # group by product
                      .filter(add_remove_filter)
                      .filter(product__id=OuterRef('id'))
                      .values_list(Sum('quantity')))

    products = (Product.objects
                .all()
                .annotate(
                    rentals=Coalesce(Subquery(rentals_subquery), 0),
                    owned=Coalesce(Subquery(owned_subquery), 0))
                .annotate(
                    available=F('owned')-F('rentals'))
                .annotate(
                    availability_css_class=Case(
                        When(Q(available__lt=0), then=Value(
                            "availability-missing")),
                        default=Value("availability-in-stock")
                    ), output_field=CharField()
                )
                )

I’m a member of the “One app until proven otherwise” camp. My opinion is that an app should effectively be a stand-alone component and not just a way to separate code. So yes, I very much believe that your entire rental system should be one app.

It seems very common to recommend splitting code into apps, which is why I have done it that way. Looking into packaging models and views, I have now read it multiple times, that if I need to use multiple files I should create a new app instead. But no answers regarding cross app subqueries.

I have defined the annotation/subquery solution above both as models.QuerySet function, but had to do it both in the product model and the transaction model to be able to do both:

{% for product in products %}
    <li>{{product.name}} - {{product.available}}</li>
{% endfor %}


{% for transaction in transactions %}
    <li>{{transaction.product.name}} - {{transaction.available}}</li>
{% endfor %}

But this does not feel very DRY. Even if it is an annotation is it possible to “reach it” like property?
to be able to do {{transaction.product.available}}

<opinion> That seems to me to be a potentially dangerous generalization with a lot of context missing around that discussion. </opinion>
I’m curious, what sources have led you to this opinion?
Also, see the various discussions here about this same topic.

See

as a starting point. In particular, note Andrew’s opening paragraph:

The main purpose of apps is, in my eyes, to provide logical separation of reusable components - specifically, a first-class namespace for models/admin/etc. - and to provide an easy way to turn things “on” or “off”.

Regarding your core question here, I’m not understanding how you’ve got these pieces assembled. You’ve shown a couple of different snippets, but being out of context of the files/classes in which they’re defined and used.

Note that at Aggregate/annotate based on 2 different models - #8 by KenWhitesell, I’ve already expressed the opinion that this type of work belongs in a custom manager and not in a model method.

Also, to avoid an N+1 query situation, you want this work to be done within the view and passed to the template through the context and not rely upon the template executing these queries. I believe you’re going to be much better off if you calculate these values as annotations in your query and not as model properties.

Hi!

I had it defined like this as QuerySet method in my models.py, but now moved it to a manager, I am not clear to the difference:

class ProductQuerySet(models.QuerySet):
    def annotate_availability(self, project):
        start_date = project.start_date
        end_date = project.end_date
        rental_filter = Q(
            Q(type='REN')
            & Q(project__start_date__lt=end_date)
            & Q(project__end_date__gt=start_date)
        )
        add_remove_filter = Q(
            Q(
                ~ Q(type='REN')
                # everything that adds products
                & Q(project__start_date__lte=start_date)
            )
            | Q(
                ~ Q(type='REN')
                # everything that removes products
                & Q(project__end_date__lt=end_date)
            )
        )

        rentals_subquery = (Transaction.objects
                            .values('product__id')  # group by product
                            .filter(rental_filter)
                            .filter(product__id=OuterRef('id'))
                            .values_list(Sum('quantity')))
        owned_subquery = (Transaction.objects
                        .values('product__id')  # group by product
                        .filter(add_remove_filter)
                        .filter(product__id=OuterRef('id'))
                        .values_list(Sum('quantity')))

        return (self.all()
                    .annotate(
                        rentals=Coalesce(Subquery(rentals_subquery), 0),
                        owned=Coalesce(Subquery(owned_subquery), 0))
                    .annotate(
                        available=F('owned')-F('rentals'))
                    .annotate(
                        availability_css_class=Coalesce(Case(
                            When(available__lt=0, then=Value(
                                "availability-missing")),
                            default=Value("availability-in-stock")
                        ),Value("availability-in-stock"))
                    )
                    )

and I combined the previous 3 apps (projects, products, transactions) into one and also added something similar “to” the transaction model.

class TransactionQuerySet(models.QuerySet):
    # Custom, chainable methods added here, which will
    # do lower level 'filter', 'order_by' etc.
    def ungrouped(self):
        return self.filter(group=None)

    def annotate_availability(self):
        start_date = self.first().project.start_date
        end_date = self.first().project.end_date
        rental_filter = Q(
            Q(type='REN')
            & Q(project__start_date__lt=end_date)
            & Q(project__end_date__gt=start_date)
        )
        add_remove_filter = Q(
            Q(
                ~ Q(type='REN')
                # everything that adds products
                & Q(project__start_date__lte=start_date)
            )
            | Q(
                ~ Q(type='REN')
                # everything that removes products
                & Q(project__end_date__lt=end_date)
            )
        )

        rentals_subquery = (Transaction.objects
                            .values('product__id')  # group by product
                            .filter(rental_filter)
                            .filter(product__id=OuterRef('product__id'))
                            .values_list(Sum('quantity')))
        owned_subquery = (Transaction.objects
                          .values('product__id')  # group by product
                          .filter(add_remove_filter)
                          .filter(product__id=OuterRef('product__id'))
                          .values_list(Sum('quantity')))

        return (self
                .annotate(
                    rentals=Coalesce(Subquery(rentals_subquery), 0),
                    owned=Coalesce(Subquery(owned_subquery), 0))
                .annotate(
                    available=F('owned')-F('rentals'))
                .annotate(
                    availability_css_class=Coalesce(Case(
                        When(available__lt=0, then=Value(
                            "availability-missing")),
                        default=Value("availability-in-stock")
                    ), Value("availability-in-stock"))
                )
                )

The reason I added it as custom QuerySet methods instead of Managers was that I had it defined (for filters and such after reading this post Thin views) and I am not clear on the difference.

In the view, I have both

project = Project.objects.get(id=pk)
products = Product.objects.annotate_availability(project)
transactions = project.transactions.all().annotate_availability()

to be able to display a list to the user of what have been added to the project (transactions) and show a list of products and how many of each are available, for the user to be able to add products to the project.

Regarding many apps, that was my interpetation of for example these pages:

I have read “each Django app should do one thing, and one thing alone.” a few times but perhaps I took it to far.

Regarding managers, let’s take a step back.

If you have a query SomeModel.objects.filter(...), objects is the model manager. This means that filter is a method on that manager.

This means that if you create a manager, then the methods in that manager class are available using the same syntax. Let’s say you have created a manager function named inventory that annotates a queryset with some values. The statement current_inventory = SomeModel.objects.inventory(some_parameter) is going to give you that annotated queryset in current_inventory.

For more details and a more complete example, see Managers | Django documentation | Django

The easiest “rule of thumb” that I apply to this is to ask the question: “Can I take this app and install it in a different project by itself?” If I would need to bring along another app to have it work, then it’s probably not an app by itself. (Our standard is that an app may access data from another app, but all changes to models within an app must be performed in that app.)

For example, we have a timesheet application. People can charge time to tasks. The tasks and the people are defined elsewhere, but the work with entering and approving timesheets is solely located within that app. The app to assign people to tasks is a different app.

We can take the timesheet app and move it to a different project having a different mechanism for assigning people to tasks, and the timesheet app would work without modification.
You could also pre-assign people to tasks, and run the timesheet app without any task-assignment facility.
Likewise, you could use the task assignment app without any timesheets being created.

Therefore, these two apps (among others in the system) meet our criteria for using them as separate apps - and from our perspective, each does “one thing”.

I’ll also point out that the only reason we took the step of separating this project into multiple apps is due to it’s size. We start to look at a physical reorganization when individual files exceed about 2500 lines. (The total project size is currently > 110,000 lines, not counting tests and fixtures.)

For any of the smaller projects (< 20,000 lines), it’s strictly a single-app structure, unless we have an easily identifiable need to use a portion of the code in multiple projects - and that is relatively rare for these.

Hi!

So far I follow, but the same is true for Custom QuerySet, I also added a manager with the from_queryset, but it just seems like more work to make it chainable, but I do not know what the benefit are compared to the Custom QuerySet?

class CustomManager(models.Manager):
    def manager_only_method(self):
        return

class CustomQuerySet(models.QuerySet):
    def manager_and_queryset_method(self):
        return

class MyModel(models.Model):
    objects = CustomManager.from_queryset(CustomQuerySet)()

The difference is one of intent or usage.

Yes, the custom QuerySet methods more easily allow you to define “chainable” components for a query. The intent of a manager method, to the extent of my knowledge, is for the creation of a complete query. It’s a way of defining a complex query that may be used in multiple locations in your project, and may use multiple queryset methods for yielding results.

For example, say you need to create a “standard” or “common” query for “active” users, where there are many criteria to identify a user as being active. (Perhaps the current date being within some range, or membership in an active group, etc, etc.)

You could define each of those individual filters in custom queryset methods, but define the full query as a manager method that applies those defined filters.

In practice it can have the effect of reducing the amount of logic in views by providing that language in the model.

So, from a structural perspective for what you’re asking, and depending upon other requirements throughout your system, I could see creating each of those subordinate clauses as queryset methods, but providing a manager method to assemble the complete query.