Optimizing model formsets that have ModelChoiceField fields

Let’s say I have a formset that consists of N forms, each with a ModelChoiceField. Using Django Debug Toolbar, I saw that Django makes N duplicate queries, one per form, to fetch the available options for the same field. That doesn’t make sense: I could run the query once and reuse the results for every identical ModelChoiceField. I saw a technique in a YouTube video in which the author explains how to get rid of N - 1 of those queries (leaving only 1) when filling the identical ModelChoiceFields (here is the link to the video with a timecode). I followed a similar approach in my project, like this:

from django import forms
from django.core.exceptions import ValidationError
from django.forms import inlineformset_factory, BaseInlineFormSet

from src.system_settings.models import Tariff, TariffService, Service


class ServiceSelect(forms.Select):
    def create_option(self, name, value, label, selected, index, subindex=None, attrs=None):
        option = super().create_option(
            name, value, label, selected, index, subindex, attrs
        )
        if value:
            option["attrs"]["data-unit"] = value.instance.unit.unit
        return option


class AdminTariffServiceForm(forms.ModelForm):
    price = forms.DecimalField(
        required=False,
        widget=forms.NumberInput(attrs={'class': 'form-control', 'min': 0.0, 'step': 0.1}),
        label='Ціна'  # "Price"
    )

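    def clean(self):
        # Call the parent so ModelForm's own cleaning (e.g. uniqueness checks) still runs.
        cleaned_data = super().clean()
        has_price = bool(cleaned_data.get('price'))
        has_service = bool(cleaned_data.get('service'))

        # Price and service must be filled in together or both left empty.
        if has_price != has_service:
            # "Price not specified or service not selected"
            raise ValidationError('Не вказано ціну або не обрано послугу')

        return cleaned_data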

    class Meta:
        model = TariffService

        widgets = {
            'service': ServiceSelect(attrs={'class': 'form-control'}),
        }

        labels = {
            'service': 'Послуга',  # "Service"
        }

        fields = ('service', 'price')

    def __init__(self, *args, **kwargs):
        service_choices = kwargs.pop('service_choices', [])

        super().__init__(*args, **kwargs)

        self.fields['service'].choices = service_choices


class BaseAdminTariffServiceFormSet(BaseInlineFormSet):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        qs = Service.objects.select_related('unit').all()

        self.service_choices = [*forms.ModelChoiceField(
            queryset=qs,
            empty_label='Виберіть...',  # "Select..."
        ).choices]

    def get_form_kwargs(self, index):
        kwargs = super().get_form_kwargs(index)
        kwargs['service_choices'] = self.service_choices
        return kwargs


AdminTariffServiceFormSet = inlineformset_factory(
    parent_model=Tariff,
    model=TariffService,
    form=AdminTariffServiceForm,
    formset=BaseAdminTariffServiceFormSet,
    can_delete=True,
    can_delete_extra=True,
    extra=1,
)

The problem is that this doesn’t prevent N DB queries for validating each ModelChoiceField (checking that a row with the selected ID exists in the table), or another N queries for saving the entire formset. Is there a way to save the whole formset with only 1 query (by using bulk_create and handling potential errors by catching IntegrityError if the ForeignKey target does not exist)?
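To make the "one insert, then catch IntegrityError" idea concrete, here is a minimal sketch at the database level using only Python's built-in sqlite3 module. It is not Django code: the table names, columns, and the `save_rows` helper are assumptions made up for illustration. On the Django side the analogous step would presumably be a `bulk_create()` call inside `transaction.atomic()`, keeping in mind that `bulk_create()` does not call each model's `save()` method.

```python
import sqlite3


def save_rows(conn, rows):
    """Insert all (tariff_id, service_id, price) rows with ONE multi-row INSERT.

    Returns None on success, or an error message if the database rejects the batch.
    """
    if not rows:
        return None
    placeholders = ", ".join(["(?, ?, ?)"] * len(rows))
    params = [value for row in rows for value in row]
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "INSERT INTO tariff_service (tariff_id, service_id, price) "
                f"VALUES {placeholders}",
                params,
            )
    except sqlite3.IntegrityError as exc:
        # e.g. a service_id that does not exist in the service table
        return f"Could not save: {exc}"
    return None


# Hypothetical schema, loosely modelled on Service / TariffService.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE service (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE tariff_service ("
    "id INTEGER PRIMARY KEY, tariff_id INTEGER NOT NULL, "
    "service_id INTEGER NOT NULL REFERENCES service(id), price NUMERIC)"
)
conn.executemany("INSERT INTO service (id, name) VALUES (?, ?)", [(1, "Water"), (2, "Gas")])
conn.commit()

ok = save_rows(conn, [(1, 1, "10.5"), (1, 2, "7.0")])   # both services exist
bad = save_rows(conn, [(1, 1, "3.0"), (1, 99, "4.0")])  # service 99 does not exist
count = conn.execute("SELECT COUNT(*) FROM tariff_service").fetchone()[0]
```

In this sketch the failing batch is rejected as a whole, so nothing from it is stored. The trade-off is that the database error does not say which form caused it, so the message can only be attached to the formset as a whole rather than to a specific field.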

Welcome @vlad1m1r0v !

First, before spending a lot of time trying to address this, have you determined that, in practice, it is actually a problem?
I’m not talking about the theoretical issues - I’m talking about checking for a real-world difference in the effective performance of your application in a production-deployment environment. (Unfortunately, because of the overhead that DDT adds to the processing, it is not a good tool for determining this. You’d need to test this out-of-band to get a proper evaluation.)

I think you’d find that, between Django’s internal cache and your database’s cache, those repeated queries are very fast.

(We’ve determined in our case, with N generally being less than 10, that it’s a non-issue.)

Side note: The current behavior makes sense, because you have the option within a formset to customize the individual forms on a per-form basis. There is nothing that requires the formset to use the same query for every form instance.

But yes, you could create a subclass with the behavior you identify.

I mean that if Django cached these queries, it would fetch the results only once, and for forms 2…N it would read them from the queryset’s _result_cache. That is not the case here, because ModelChoiceField under the hood always calls the .all() method of the provided queryset for each ModelChoiceField. In my observations the execution time dropped by a factor of three with the approach from the video (I reduced the number of queries from 14 to just 1). The expected number of forms in the formset is ~100, so I am also thinking about optimising the save method, since inserting multiple rows in a single statement is often more efficient than multiple individual inserts: it reduces the number of round-trips between the application and the DB server. I also don’t want to make N redundant queries to validate the ForeignKey fields. If the process fails due to an IntegrityError, I can handle it myself (raise a ValidationError, then show that error in the template by getting it from messages…).
Thanks for your reply regardless.
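The caching point can be illustrated with a toy model. The `FakeQuerySet` class below is a made-up stand-in, not Django's implementation; it only mimics two behaviours: results are cached on the object after the first evaluation, and `.all()` returns a fresh copy with an empty cache.

```python
class FakeQuerySet:
    """Toy stand-in for a lazy queryset; NOT Django's actual QuerySet."""

    queries_run = 0  # shared counter of simulated database hits

    def __init__(self, rows):
        self._rows = rows
        self._result_cache = None

    def all(self):
        # Return a new object whose cache is empty, as a clone would be.
        return FakeQuerySet(self._rows)

    def __iter__(self):
        if self._result_cache is None:
            FakeQuerySet.queries_run += 1  # simulated round-trip to the database
            self._result_cache = list(self._rows)
        return iter(self._result_cache)


base = FakeQuerySet([(1, "Water"), (2, "Gas")])

# Iterating the SAME object twice uses its cache: one simulated query.
list(base)
list(base)
same_object_queries = FakeQuerySet.queries_run

# One clone per form: every clone starts with an empty cache.
FakeQuerySet.queries_run = 0
for _ in range(14):
    list(base.all())
per_form_queries = FakeQuerySet.queries_run

# Evaluate once in the formset and hand the same list to every form.
FakeQuerySet.queries_run = 0
shared_choices = list(base.all())
choices_per_form = [shared_choices for _ in range(14)]
shared_queries = FakeQuerySet.queries_run
```

Under these assumptions the per-form clones cost one simulated query each, while the shared list costs a single query no matter how many forms reuse it.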

One of the problems here is that you are running this with DEBUG=True (DDT doesn’t show up otherwise), and you’re likely running this under runserver (which is far from being a production-quality server), and you’re evaluating this with DDT - which is itself affecting your timing of the results.

You cannot make the determination that this is a problem or an issue needing to be addressed until you have benchmarked this in a real environment. Right now, this is all conjecture based on (potentially) misleading information, leading you to expend effort that may not (in real terms) be necessary.
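As a rough illustration of measuring outside of DDT, a small timing harness like the following could be used. It is only a sketch: `time_call` and the two workload functions are hypothetical placeholders, and a meaningful benchmark would call the real formset code against the real database under a production-like server.

```python
import statistics
import time


def time_call(func, repeats=30):
    """Return the median wall-clock time, in seconds, of `repeats` calls to func()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Hypothetical stand-ins for the two code paths being compared. In a real test
# each would build or validate the actual formset against the real database.
def per_form_choices():
    return [[(i, f"Service {i}") for i in range(200)] for _ in range(100)]


def shared_choices():
    choices = [(i, f"Service {i}") for i in range(200)]
    return [choices for _ in range(100)]


per_form_seconds = time_call(per_form_choices)
shared_seconds = time_call(shared_choices)
print(f"per-form: {per_form_seconds:.6f}s  shared: {shared_seconds:.6f}s")
```

Taking the median of several runs reduces the effect of one-off spikes, such as a cold cache on the first call.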

The very famous quote from Donald Knuth applies here:

“Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”

There’s a good chance that you have already spent more time worrying about this than you will ever save over the lifespan of your project.

On the other hand, if this really does fit into that “3%” of truly time-critical operations, then you would be best served by creating your own Formset subclass containing those optimizations.