I have a set of queries where at one point the db gets hit a lot. I know where the problem is, but I can't figure out how to fix it or make it more efficient.
My models are:
from django.db import models
from django.utils import timezone

class Grade(models.Model):
    score = models.CharField(max_length=4, blank=True)
    assessment = models.ForeignKey('Assessment', on_delete=models.CASCADE)
    objective = models.ForeignKey('Objective', on_delete=models.CASCADE)
    student = models.ForeignKey('Student', on_delete=models.CASCADE)
    cblock = models.ForeignKey('Classroom', on_delete=models.CASCADE)
    time_created = models.DateField(default=timezone.now)

class Assessment(models.Model):
    name = models.CharField(max_length=20, blank=True)
    omit = models.BooleanField(default=False)

class Classroom(models.Model):
    name = models.CharField(max_length=20, blank=True)

class Objective(models.Model):
    name = models.CharField(max_length=20)
    course = models.ForeignKey('Course', on_delete=models.CASCADE)

class Course(models.Model):
    name = models.CharField(max_length=20)

class Student(models.Model):
    name = models.CharField(max_length=20)
    classroom = models.ForeignKey(Classroom, on_delete=models.CASCADE)
The problematic view is:
def gradebook(request, classroom_id, course_id):
    classroom = Classroom.objects.get(pk=classroom_id)
    objective_list = Objective.objects.filter(course=course_id).order_by('name')
    student_list = Student.objects.filter(classroom=classroom_id).order_by('name')
    for student in student_list:
        for obj in objective_list:
            q = Grade.objects.filter(student=student.id, objective=obj.id, assessment__omit=False).order_by('-time_created')
            if q.exists():
                if len(q) < 3:
                    qn = q.first()
                else:
                    .....
q.exists() and qn = q.first() are the repeated hits to the db. I don't think I can use prefetch_related for either objective_list or student_list. For example, I tried passing 'grade' to prefetch_related(), but that gave me an error that 'grade' is an invalid parameter to prefetch_related(). I guess prefetch_related doesn't work from the other side of the foreign key.
Is that the best I can expect to do? I think passing a Grade queryset to prefetch_related() doesn't help either, because when I filter the prefetched queryset it hits the db again?
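For reference, this is roughly the kind of thing I mean, using the default grade_set reverse accessor (since Grade has no related_name) and a to_attr name I made up for the sketch; I may well be misunderstanding how the prefetched queryset behaves:

from django.db.models import Prefetch

student_list = (
    Student.objects
    .filter(classroom=classroom_id)
    .order_by('name')
    .prefetch_related(
        Prefetch(
            'grade_set',
            queryset=Grade.objects.filter(assessment__omit=False).order_by('-time_created'),
            to_attr='filtered_grades',
        )
    )
)

# Then inside the loops, filter the prefetched list in Python instead of hitting the db:
for student in student_list:
    for obj in objective_list:
        q = [g for g in student.filtered_grades if g.objective_id == obj.id]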
Another idea I had was to get the values of all the grades into a list (grade_values) before iterating, and then do the searches in that list inside the loops. This would reduce the 600 db queries to one, but there is the cost of doing 600 searches through the list of items. The number of searches would rarely go higher than 600, but the number of items in the list could be up to 5000 (typically around 2000).
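Roughly like this (a sketch; here I've grouped the values into a dict keyed by (student, objective), so each lookup inside the loops is a dict hit rather than a scan of the whole list):

from collections import defaultdict

grade_values = (
    Grade.objects
    .filter(cblock=classroom_id, assessment__omit=False)
    .order_by('-time_created')
    .values('student_id', 'objective_id', 'score', 'time_created')
)

# One pass to index the 2000-5000 rows, then every (student, objective) lookup is O(1)
grades_by_pair = defaultdict(list)
for g in grade_values:
    grades_by_pair[(g['student_id'], g['objective_id'])].append(g)

for student in student_list:
    for obj in objective_list:
        q = grades_by_pair.get((student.id, obj.id), [])
        if q and len(q) < 3:
            qn = q[0]  # newest first, because of the -time_created ordering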
Is one method preferred over the other, or is this something that we try to measure (response times?) and decide from that?
You're hitting the database 600 times because of the Grade.objects.filter(...) call inside the nested loops over student_list and objective_list: that's a separate query (plus the .exists() check) for every (student, objective) pair.
You'd be better off building an appropriate query involving (some combination of) annotations, aggregations, and/or subqueries, so that you're not issuing a separate query for each (objective, student) pair.
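For example, something along these lines (a rough sketch only, not tested against your models, and you'd still need whatever the else branch does):

from django.db.models import Count, Max

# One query that aggregates per (student, objective): how many non-omitted
# grades there are and when the newest one was created.
per_pair = (
    Grade.objects
    .filter(cblock=classroom_id, assessment__omit=False)
    .values('student_id', 'objective_id')
    .annotate(num_grades=Count('id'), latest=Max('time_created'))
)

# per_pair is a single db query; you can index its rows in Python and look them
# up while building the table, instead of calling .exists()/.first() once per cell.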
I get all the objects I need with all_grades = Grade.objects.filter(cblock=classroom_id).
I use the iterations mostly just to produce an ordered list that I pass to the template, and the template goes through two for loops to fill a table (columns are objectives, rows are students, cells are the grades).
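In case it clarifies things, this is roughly how I build the structure I pass to the template (by_cell and grade_table are just names I'm using for this sketch):

from collections import defaultdict

all_grades = Grade.objects.filter(cblock=classroom_id)

# Index the grades once so filling each cell is a dict lookup
by_cell = defaultdict(list)
for g in all_grades:
    by_cell[(g.student_id, g.objective_id)].append(g)

# One row per student, one cell per objective, in the order the template expects
grade_table = [
    (student, [by_cell.get((student.id, obj.id), []) for obj in objective_list])
    for student in student_list
]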