Is DEP009 ("async-capable Django") still relevant?

rtpg · October 2, 2024, 5:06am

I have been working on trying to just add support for async cursors from postgresql, and I’ve unfortunately hit what I believe to be a major design challenge with the save/asave strategy on Model in particular.

Currently, asave just calls into save. So Model subclasses with overwritten save methods work fine.

If we want async saves “for real”, the straightforward way to do so is for asave to not call into save (which can block). But if we make that change, suddenly there’s this behavior divergence between calling save or asave with existing code!

This is of course true for overwriting more or less any method in the stack here. But save it feels acute because even in my test app I am using the venerable TimeStampedModel which… overwrites save!

    def save(self, *args: Any, **kwargs: Any) -> None:
        """
        Overriding the save method in order to make sure that
        modified field is updated even if it is not given as
        a parameter to the update field argument.
        """
        update_fields = kwargs.get('update_fields', None)
        if update_fields:
            kwargs['update_fields'] = set(update_fields).union({'modified'})

        super().save(*args, **kwargs)

So in a world where asave doesn’t call into save, what is the transition strategy for TimestampModel? And how do we avoid exporting the function coloring problem to users?

In a world where asave calls save, what magic could we use to have multiple concurrent pieces of work?

Right now, calling aX functions will quickly get you into a sync_to_async(X) call. So async calls all will quickly get pointed “back” to their sync variants.

This means that, roughly, every time I do a top-level async operation I will pay one sync_to_async cost. I think it’s important here to say that the cost doesn’t compound on deeper trees, given that you end up “in sync mode” for the rest of your call tree.

If we inverted things, so that calling X calls aX, then we instead pay one async_to_sync cost per top-level operation.

But if we have X call aX instead of the contrary, we end up in an interesting place. Overrides for save “just work”, and it’s only people calling asave who need to figure things out.

But the cost to sync mode might be too much? In that case it might make sense to have two model variants, models.Model and models.AsyncModel. In Model, save is the “canonical” version. In AsyncModel, asave is the canonical version.

Of course if you wanted TimeStampedModel to support both, you need to implement save and asave… so you start getting into things like "I’m using signals for things like TimeStampedModel (or we add hooks for pre-save/post-save as model methods…). But if Model subclasses get a hold of the thread of control here for I/O, we’ve already kind of lost.

On a higher level, I do think that saying something like “users should only need to override save as a last resort” (in the same way that the ORM has raw for last resorts), because then people could move code over to strategies that don’t require controlling the thread of execution. And in that model you could outright have semi-automated checks to help people figure out why their models might not be async-performant.

If we had def prep_save(self, **save_kwargs) and def do_post_save(self, instance, **save_kwargs) in the model API, that might be enough to avoid most save operation requirements… though then we again hit the function color problem.

But with such hooks inserted now, it would be, IMO, reasonable to say something like “we’ll look at whether the hook is a coroutine or not, and pay a sync/async transition cost if needed”. Or have some other magic that would let people write async-agnostic hook implementations.

After having written all of this, not clear to me how much custom save methods come into play (often, custom processing people do tends to be in forms, DRF serializers etc…), but there might simply be a thing of saying “you gotta override save and asave now”. I do think it will likely require some sort of transition release to help people figure out what’s needed.

EDIT: Did notice this post offering one idea of using a context manager to indicate you really do want async-ness. There might be a way to combine this idea with my concerns to make it possible to flag where some overwritten methods might be lurking

Topic		Replies	Views
Asyncifying django.contrib.auth? (and signals, and maybe sessions) Async	35	4626	October 8, 2024
Maybe it's just a blind spot, when it comes to async django Async	22	4517	May 5, 2024
DEP0009/ORM implementation plan Async	14	498	December 5, 2024
Current Async TODOs Async	20	5307	August 14, 2020
Vinyl, a django-based async orm Async	0	1019	December 9, 2021

Is DEP009 ("async-capable Django") still relevant?

Related topics