Hi guys,
I wanted to ask whether you think it's a reasonable idea to add an option to Django for performance warnings about query generation?
Motivating example:
My professor gave me the task of finding out why Django created 239 SQL queries for a specific webpage. I eventually figured out that most of these queries were caused by accessing a ForeignKey attribute within the __init__() method of a model class.
I was new to Django and had no previous knowledge of descriptors in Python, so it took me quite a while to figure this out.
Additionally, because QuerySets are defined and evaluated in different locations, it can be hard to figure out where a select_related()/prefetch_related() is missing (especially if, for example, the QuerySet is defined in a view method and the related object is accessed in a method defined within a model class).
This led me to think about whether Django could make it easier for developers to debug these performance issues, so I tried implementing warnings for this. Here is what I came up with:
I added an option for enabling performance warnings to the project's settings file:
PERFORMANCE_WARNINGS = False
Within the related_descriptors file I added the warnings (using Python's warnings module) to all of the __get__() methods. For example, from the __get__() method of the ForwardManyToOneDescriptor:
def __get__(self, instance, cls=None):
    """
    Get the related instance through the forward relation.
    With the example above, when getting ``child.parent``:
    - ``self`` is the descriptor managing the ``parent`` attribute
    - ``instance`` is the ``child`` instance
    - ``cls`` is the ``Child`` class (we don't need it)
    """
    if instance is None:
        return self
    # The related instance is loaded from the database and then cached
    # by the field on the model instance state. It can also be pre-cached
    # by the reverse accessor (ReverseOneToOneDescriptor).
    try:
        rel_obj = self.field.get_cached_value(instance)
    except KeyError:
        # Added code
        if settings.PERFORMANCE_WARNINGS:
            warnings.warn(
                (
                    "\nPerformance Warning: A database query was created when trying "
                    "to access attribute '{}' in the model \n{}\n relating to the model \n{}.\n "
                    "This is either the case because '{}' is accessed within the "
                    "__init__method of \n{}\n or because there was no "
                    "select_related()/prefetch_related() with '{}' as parameter."
                ).format(
                    self.field.name,
                    instance.__class__,
                    self.field.related_model,
                    self.field.name,
                    instance.__class__,
                    self.field.name,
                ),
                stacklevel=2,
            )
        # End of added code
The reasoning here is that, if I am not mistaken, a cache miss in __get__() only happens in two cases:
either select_related()/prefetch_related() was not used for this attribute, or the attribute was accessed within the __init__() method of the model (or both). In both cases the warning makes sense here.
As the developer might want the query to happen here, the PERFORMANCE_WARNINGS setting is False by default.
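To make the mechanism easier to try out without a Django project, here is a minimal, Django-free sketch of the same idea: a descriptor that caches a lazily loaded value and warns on a cache miss. The names LazyRelatedDescriptor, PerformanceWarning, Child and the module-level PERFORMANCE_WARNINGS flag are made up for this illustration and are not part of Django or of my patch; the loader function only stands in for the database query.

```python
import warnings

# Hypothetical stand-in for settings.PERFORMANCE_WARNINGS.
PERFORMANCE_WARNINGS = True


class PerformanceWarning(UserWarning):
    """Hypothetical category so these warnings can be filtered separately."""


class LazyRelatedDescriptor:
    """Toy descriptor: loads a related value on first access, then caches it."""

    def __init__(self, name, loader):
        self.name = name
        self.loader = loader  # stands in for the database query

    def __get__(self, instance, cls=None):
        if instance is None:
            return self
        cache = instance.__dict__.setdefault("_related_cache", {})
        try:
            return cache[self.name]
        except KeyError:
            # Cache miss: this is the point where a query would be issued.
            if PERFORMANCE_WARNINGS:
                warnings.warn(
                    "A query was issued when accessing '{}' on {}.".format(
                        self.name, type(instance).__name__
                    ),
                    PerformanceWarning,
                    stacklevel=2,  # point at the line that accessed the attribute
                )
            cache[self.name] = self.loader(instance)
            return cache[self.name]


class Child:
    parent = LazyRelatedDescriptor(
        "parent", lambda instance: "parent-of-{}".format(instance.pk)
    )

    def __init__(self, pk):
        self.pk = pk


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    child = Child(1)
    child.parent  # cache miss: loads the value and warns
    child.parent  # cache hit: no warning
print(len(caught))
```

The second access is served from the cache, so only the first one produces a warning. Using a dedicated warning category (instead of the default UserWarning in my patch) is just one possible design choice; it would make it easier to silence or escalate only these warnings.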
When I turn the performance warnings on within the motivating example, this is what the terminal prints out:
/home/christoph/Studium/Bachelorarbeit/Django/mastersystem/mapsys2/core/models/master_application_system.py:363: UserWarning:
Performance Warning: A database query was created when trying to access attribute 'status' in the model
<class 'core.models.master_application_system.StudyProgramChoices'>
relating to the model
<class 'core.models.master_application_system.ApplicationStatus'>.
This is either the case because 'status' is accessed within the __init__method of
<class 'core.models.master_application_system.StudyProgramChoices'>
or because there was no select_related()/prefetch_related() with 'status' as parameter.
if self.status is None:
/home/christoph/Studium/Bachelorarbeit/Django/mastersystem/mapsys2/core/models/master_application_system.py:1269: UserWarning:
Performance Warning: A database query was created when trying to access attribute 'overall_grade' in the model
<class 'core.models.master_application_system.HumanReviews'>
relating to the model
<class 'core.models.master_application_system.Grades'>.
This is either the case because 'overall_grade' is accessed within the __init__method of
<class 'core.models.master_application_system.HumanReviews'>
or because there was no select_related()/prefetch_related() with 'overall_grade' as parameter.
self.__original_overall_grade = self.overall_grade
/home/christoph/Studium/Bachelorarbeit/Django/mastersystem/mapsys2/core/models/master_application_system.py:910: UserWarning:
Performance Warning: A database query was created when trying to access attribute 'university' in the model
<class 'core.models.master_application_system.DegreesObtained'>
relating to the model
<class 'core.models.master_application_system.UniversitiesWorld'>.
This is either the case because 'university' is accessed within the __init__method of
<class 'core.models.master_application_system.DegreesObtained'>
or because there was no select_related()/prefetch_related() with 'university' as parameter.
"university": self.university.name if self.university else None,
/home/christoph/Studium/Bachelorarbeit/Django/mastersystem/mapsys2/core/models/master_application_system.py:914: UserWarning:
Performance Warning: A database query was created when trying to access attribute 'subject' in the model
<class 'core.models.master_application_system.DegreesObtained'>
relating to the model
<class 'core.models.master_application_system.StudySubjects'>.
This is either the case because 'subject' is accessed within the __init__method of
<class 'core.models.master_application_system.DegreesObtained'>
or because there was no select_related()/prefetch_related() with 'subject' as parameter.
"subject": self.subject.subject_name if self.subject else None
I think this would be very helpful for debugging, especially for developers who are not Django experts, as these warnings directly tell you which line of code caused a potentially avoidable query and how it might be fixed.
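Since the warnings go through Python's standard warnings module, the usual filters should apply to them. As a rough sketch (the helper names emit_performance_warning and access_raises are invented for this example, and the message is a shortened copy of the one above), such a warning could be escalated to an exception, for example in a test, so that an unexpected query fails loudly:

```python
import warnings


def emit_performance_warning(field_name):
    # Hypothetical helper that mimics the message format of the patch above.
    warnings.warn(
        "\nPerformance Warning: A database query was created when trying to "
        "access attribute '{}'.".format(field_name),
        stacklevel=2,
    )


def access_raises(field_name):
    """Return True if the warning was escalated to an exception."""
    with warnings.catch_warnings():
        # The message argument is a regular expression matched against the
        # start of the warning text; \s* skips the leading newline.
        warnings.filterwarnings(
            "error", message=r"\s*Performance Warning", category=UserWarning
        )
        try:
            emit_performance_warning(field_name)
        except UserWarning:
            return True
    return False


print(access_raises("status"))
```

The same filter with the action "ignore" would silence the warnings for code paths where the query is intended.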