DRF and nested serialisers optimisation with prefetch_related()

Hi All,

I’m working on a project that uses Django and DRF. I’m trying to optimise a slow API call, and in an effort to reduce the query time I’m using select_related() and prefetch_related() in the get_queryset() of my view.

The view heavily utilises nested serializers; I’m effectively returning all related objects to the model which I’m fetching in the view.

My view returns data as expected, but Django Debug Toolbar shows that the number of queries to the DB is the same whether I use the prefetch functionality or comment it all out.

What I’m failing to understand is:

  • Am I using the prefetch functionality correctly?
  • Am I using the prefetch functionality in the correct place, i.e. the view?
  • Perhaps my database design or this particular query doesn’t lend itself to prefetching.
  • Perhaps I could improve performance by using DB indexes, but I’m not entirely sure which are the most efficient spots to use them.

So ultimately my questions are: am I using prefetch correctly, and why does using prefetch in the view’s get_queryset() seem to have no effect?

Below is my view and my serializer.

View

class NewCaseAPIView(generics.RetrieveAPIView):
    serializer_class = EntireCaseSerializer
    permission_classes = [IsAuthenticated]
    case = None
    species = None

    def get_queryset(self):
        self.case = self.kwargs["id"]
        self.species = self.kwargs["species"]
        species = Species.objects.get(species=self.species)
        queryset = Case.objects.all().filter(id=self.case, species=species)
        queryset.select_related("breed_set")
        queryset.select_related("species_set")
        queryset.select_related("fields_set")
        queryset = queryset.prefetch_related("question", "question__answers")
        queryset = queryset.prefetch_related(
          "pop_quizes", "pop_quizes__pop_quiz", "pop_quizes__pop_quiz__pop_quiz_answers"
        )
        queryset = queryset.prefetch_related("history_categories", "history_categories__history")
        queryset = queryset.prefetch_related("differential")
        queryset = queryset.prefetch_related("treatment")
        queryset = queryset.prefetch_related("physicals")
        queryset = queryset.prefetch_related("reading")
        queryset = queryset.prefetch_related("inappropriatetest_set")
        queryset = queryset.prefetch_related(
            "diagnostic",
            "diagnostic__categories",
            "diagnostic__categories__sub_categories",
            "diagnostic__tests",
            "diagnostic__tests__parameters",
            "diagnostic__tests__parameters__parameters",
            "diagnostic__tests__parameters__parameters",
        )
        return queryset

    def get_object(self):
        return get_object_or_404(self.get_queryset(), id=self.case)

Serializer

class EntireCaseSerializer(serializers.ModelSerializer):
    """
    Return all case related objects for the start of a case
    """

    history_categories = HistoryCategorySerializer(many=True, read_only=True)
    differential = CensoredDifferentialSerializer(many=True, read_only=True)
    treatment = CensoredTreatmentSerializer(many=True, read_only=True)
    diagnostic = CaseDiagnosticSerializer()
    physicals = PhysicalSerializer(many=True, read_only=True)
    pop_quiz_questions = PopQuizQuestionSerializer(many=True, read_only=True)
    breed = BreedSerializer()
    species = SpeciesSerializer()
    image = ImageSerializer()
    fields = FieldSerializer(many=True, read_only=True)
    has_quiz = serializers.SerializerMethodField()

    class Meta:
        model = Case
        fields = (
            "id",
            "uuid",
            "total_points",
            "passing_points",
            "budget",
            "starting_health",
            "name",
            "history_time",
            "problem",
            "image",
            "history_categories",
            "differential",
            "treatment",
            "diagnostic",
            "physicals",
            "pop_quiz_questions",
            "pop_info",
            "age_years",
            "weight",
            "gender",
            "owner",
            "breed",
            "castrated",
            "difficulty",
            "species",
            "fields",
            "has_quiz",
        )

    @staticmethod
    def get_has_quiz(obj):
        return Question.has_case_quiz(obj)

Because you’re in a RetrieveAPIView, you’re fetching information for just one case. As a result, using prefetch_related() won’t buy you any query savings, because each of those related models still has to be queried individually.

However, there are a few things you could do to make things better:

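  1. You forgot to assign the results of your three select_related() calls back to queryset. QuerySet methods return a new QuerySet rather than modifying the existing one, so the unassigned calls have no effect. That might be a copy/paste bug though.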
  2. Assuming pop_quiz is a ForeignKey or OneToOneField from your pop quizzes model, then you can use a django.db.models.Prefetch object to select the pop_quiz alongside the pop_quizes:
from django.db.models import Prefetch

# "pop_quiz__pop_quiz_answers" follows the lookup names used in the question.
queryset = queryset.prefetch_related(
    Prefetch(
        "pop_quizes",
        queryset=models.PopQuiz.objects.select_related("pop_quiz").prefetch_related(
            "pop_quiz__pop_quiz_answers"
        ),
    )
)

You get the big savings on list views, where instead of having to do ~18 queries for each row returned, you have to do ~18 queries total.
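
The arithmetic behind that can be sketched roughly as below. This is a simplified model, not a measurement: the function name `estimated_queries` and the 50-row list size are made up for illustration, 18 is just the approximate figure quoted above, and nested relations (where each parent object can trigger its own lookup) are ignored.

```python
def estimated_queries(rows: int, relations: int, prefetch: bool) -> int:
    """Rough query count: one query for the main rows plus the related lookups."""
    if prefetch:
        # With prefetch_related(): one extra query per relation,
        # no matter how many rows were returned.
        return 1 + relations
    # Without it: every row triggers its own lookup for every relation.
    return 1 + rows * relations


RELATIONS = 18  # approximate figure from the answer above

# Detail view (a single Case): prefetching makes no difference in this model.
print(estimated_queries(1, RELATIONS, prefetch=False))  # 19
print(estimated_queries(1, RELATIONS, prefetch=True))   # 19

# Hypothetical list view returning 50 cases: this is where the saving shows up.
print(estimated_queries(50, RELATIONS, prefetch=False))  # 901
print(estimated_queries(50, RELATIONS, prefetch=True))   # 19
```

In practice the exact numbers depend on how the nested serializers walk the relations, so Django Debug Toolbar remains the thing to trust.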

Hi Drew,

Thank you for the help. Fired up Prefetch and got the cogs turning again. You were correct though, I can’t seem to get a lot of improvement on the get() lookup. Lesson learnt.

And thanks for spotting the typos! That helped!

Cheers,

Conor