I have been working on trying to just add support for async cursors from postgresql
, and I’ve unfortunately hit what I believe to be a major design challenge with the save
/asave
strategy on Model
in particular.
Currently, asave
just calls into save
. So Model
subclasses with overwritten save
methods work fine.
If we want async saves “for real”, the straightforward way to do so is for asave
to not call into save
(which can block). But if we make that change, suddenly there’s this behavior divergence between calling save
or asave
with existing code!
This is of course true for overwriting more or less any method in the stack here. But save
it feels acute because even in my test app I am using the venerable TimeStampedModel
which… overwrites save
!
def save(self, *args: Any, **kwargs: Any) -> None:
"""
Overriding the save method in order to make sure that
modified field is updated even if it is not given as
a parameter to the update field argument.
"""
update_fields = kwargs.get('update_fields', None)
if update_fields:
kwargs['update_fields'] = set(update_fields).union({'modified'})
super().save(*args, **kwargs)
So in a world where asave
doesn’t call into save
, what is the transition strategy for TimestampModel
? And how do we avoid exporting the function coloring problem to users?
In a world where asave
calls save
, what magic could we use to have multiple concurrent pieces of work?
Right now, calling aX
functions will quickly get you into a sync_to_async(X)
call. So async calls all will quickly get pointed “back” to their sync variants.
This means that, roughly, every time I do a top-level async operation I will pay one sync_to_async
cost. I think it’s important here to say that the cost doesn’t compound on deeper trees, given that you end up “in sync mode” for the rest of your call tree.
If we inverted things, so that calling X
calls aX
, then we instead pay one async_to_sync
cost per top-level operation.
But if we have X
call aX
instead of the contrary, we end up in an interesting place. Overrides for save
“just work”, and it’s only people calling asave
who need to figure things out.
But the cost to sync mode might be too much? In that case it might make sense to have two model variants, models.Model
and models.AsyncModel
. In Model
, save
is the “canonical” version. In AsyncModel
, asave
is the canonical version.
Of course if you wanted TimeStampedModel
to support both, you need to implement save
and asave
… so you start getting into things like "I’m using signals for things like TimeStampedModel
(or we add hooks for pre-save/post-save as model methods…). But if Model
subclasses get a hold of the thread of control here for I/O, we’ve already kind of lost.
On a higher level, I do think that saying something like “users should only need to override save
as a last resort” (in the same way that the ORM has raw
for last resorts), because then people could move code over to strategies that don’t require controlling the thread of execution. And in that model you could outright have semi-automated checks to help people figure out why their models might not be async-performant.
If we had def prep_save(self, **save_kwargs)
and def do_post_save(self, instance, **save_kwargs)
in the model API, that might be enough to avoid most save
operation requirements… though then we again hit the function color problem.
But with such hooks inserted now, it would be, IMO, reasonable to say something like “we’ll look at whether the hook is a coroutine or not, and pay a sync/async transition cost if needed”. Or have some other magic that would let people write async-agnostic hook implementations.
After having written all of this, not clear to me how much custom save
methods come into play (often, custom processing people do tends to be in forms, DRF serializers etc…), but there might simply be a thing of saying “you gotta override save
and asave
now”. I do think it will likely require some sort of transition release to help people figure out what’s needed.
EDIT: Did notice this post offering one idea of using a context manager to indicate you really do want async-ness. There might be a way to combine this idea with my concerns to make it possible to flag where some overwritten methods might be lurking