Hi,
We have a django-based project and recently decided to up our django dependency up from 3.2 to 4.2, but discovered some test failures and I’d like to learn more about the change in saving behavior between the 2 versions.
I have implemented work-arounds/fixes for the new save behavior and I believe I understand the change. I can also understand potentially why the change was made, even though I disagree with it. But first, let me first see if my understanding is correct. I should probably start with what my code does that 4.2 broke…
Background
I actually have 2 Model superclasses that raise exceptions in 4.2, (HierCachedModel
and MaintainedModel
). The fix for both classes is simple, but since the fix for MaintainedModel
involves custom functions in the derived class, which may not behave as a developer intends, I’m going to focus on MaintainedModel
. The basic idea behind MaintainedModel
is, if your Model
isa
MaintainedModel
, then you can create update methods for specific model fields that you can decorate as “updater” methods. In the decorator, you can also identify related Model
classes whose changes should trigger an update to the maintained field. The superclass overrides the model’s save()
method (along with other methods) to catch and trigger/propagate these changes. It makes updates to fields before calling super().save()
.
The Error
When we updated Django, a number of tests threw exceptions such as:
ValueError: 'Animal' instance needs to have a primary key value before this relationship can be used.
The problem stemmed from:
- calls to update methods that are implemented in the derived class’s updater methods which query related records
- traversals of relations implemented in
MaintainedModel
to trigger updates to related records
My Understanding
So I get what’s happening. In the above error case, the updater method is making a query on self.samples.filter(...
and since the animal object is being created and Sample
has a foreign key into Animal
, self.samples
is null (i.e. it doesn’t exist), hence an exception (inferred from the fact that pk
doesn’t exist).
In django 3.2, it didn’t throw an exception, it just returned an empty queryset. In fact, if you called .count()
on it, it returned 0
. That made sense to me, because if you want to know how many related records exist, or you wanted a list of related records, you could answer that question (because the relations didn’t exist) and return either 0 or an empty list. I.e. if relations haven’t been set up, you can still respond to those queries.
Now however, if a developer using my class wants to update a field named “last_serum_sample”, they would have to handle the null case, where samples
is null (or rather, infer that from pk
not existing).
My quick fix for this issue was to catch that exception in the superclass and just not perform the update. I.e. I could assume that if the value being computed relies on a non-existent relation, because the record is just now being created, then I can deduce that there’s no valid value to enter into the field.
But their computed value may also include data from the object (i.e. not transiting a relation) or it could include data from e.g. Sample.objects.filter(...
or any other computation that would not throw an exception in 4.2. So ignoring that update may throw away valid values.
My Question
I hate to require developers using my class to have to complicate their updater method code to test whether the primary key exists, or for me to clutter the log output with warnings about un-auto-updated fields.
I thought about possibly using “post-save” functionality. It would make my base class code more complex, but I also anticipate issues with saving updates in a post-save. Would that even be allowed, like if I use update_fields
to possibly avoid an IntegrityError
from a second save call, especially if the result of the developers update method results in no change?
Or is there a way to tell Django to behave the way it did in 3.2 for the duration of the save and return empty querysets?