The Trouble With Middleware

andrewgodwin · September 9, 2019, 1:35am

So, the current sticking point I have with the async work is middleware - specifically, synchronous middleware.

The design of Django’s “new style” middleware - a callable that calls another callable - means that the context of the middleware has to stay open while the view runs. I have synchronous middleware adapting around asynchronous views just fine, but this does mean we waste a whole synchronous thread per async view call, which defeats the point of having async in the first place, really.
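
To make the problem concrete, here is a minimal sketch of the "callable that calls another callable" shape in plain Python. Nothing here is imported from Django: `timing_middleware` and `view` are hypothetical names, and plain dicts stand in for the request and response objects. The point is only that the middleware's frame stays alive for as long as `get_response` runs.

```python
import time


def timing_middleware(get_response):
    """New-style middleware: a callable that wraps the next callable."""

    def middleware(request):
        started = time.monotonic()          # runs before the view
        response = get_response(request)    # this frame waits until the view returns
        response["elapsed"] = time.monotonic() - started  # runs after the view
        return response

    return middleware


def view(request):
    # Hypothetical view; a slow view keeps the middleware above waiting.
    time.sleep(0.01)
    return {"status": 200, "path": request["path"]}


handler = timing_middleware(view)
response = handler({"path": "/"})
```

If `middleware` is synchronous and the view is asynchronous, whatever thread is running `middleware` has nothing to do but wait for the view to finish.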

I can’t think of an easy way out of this; so far, the only options I can consider (neither of which is good) are:

Rewrite all the basic Django middleware to be async and tell people not to use non-async middleware if they want massive parallelism (throwing anyone with non-standard middleware under the bus)
Somehow pause the sync middleware and suspend execution in a way where we can come back to it (not even sure this is possible)

Alternative suggestions of how to approach this would be most welcome…

manfre · September 10, 2019, 2:06pm

Rewriting all the Django-provided middleware to support async doesn’t really throw anyone under the bus. It provides a few examples for people to reference when updating their own, and until that happens, they can keep running their current sync infrastructure. Those who need the extra parallelism will put forth the effort to take advantage of it. Extra documentation on some common patterns for moving sync to async could also help.

This sounds like, in the best-case scenario, it would add a lot of complexity or more moving parts.

andrewgodwin · September 10, 2019, 4:27pm

Yeah, I’m not averse to rewriting the Django middleware, it just makes the whole thing a much bigger effort until it’s properly usable. I’m going to probably sit down and play with the suspension idea at DjangoCon US in a couple of weeks, when there are some more talented minds I can steal ideas from!

patrys · September 11, 2019, 6:43am

I think rewriting everything is what we actually want in the long run. Existing projects with custom middleware won’t get any faster but also won’t get any slower.

nicolaslara · September 11, 2019, 11:17am

I generally agree with this. Particularly since, while having the view run within an open thread doesn’t improve concurrency across views, it does allow users to take advantage of async within the view (cache, ORM, templates, etc)

When adapting third-party sync middlewares, we could also raise a warning and point to the documentation on how to port them to incentivize users to rewrite them.

I would suggest adding configuration to specify the behaviour of the middleware processing. An example:

MIDDLEWARE_BEHAVIOUR='async' (default). All middlewares must be async; raise an error if they aren’t
MIDDLEWARE_BEHAVIOUR='adapt'. Automatically adapt the middlewares, raise a warning as described above and explain the consequences for concurrency
MIDDLEWARE_BEHAVIOUR='suspend_sync'. Whatever the sync suspension magic does.

I’m not sure if there are other behaviours that would make sense here.
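
A rough sketch of how the first two modes of such a behaviour setting might be checked is below. This is purely illustrative: `check_middleware` is a hypothetical helper, not anything in Django, and it assumes middlewares are plain functions so that `inspect.iscoroutinefunction` is enough to tell sync from async. The 'suspend_sync' mode is left out because nobody knows yet what it would do.

```python
import inspect
import warnings


def check_middleware(middlewares, behaviour="async"):
    """Return the names of sync middlewares, enforcing the chosen behaviour.

    Hypothetical helper: `behaviour` mirrors the values proposed above.
    """
    sync_names = [
        mw.__name__ for mw in middlewares
        if not inspect.iscoroutinefunction(mw)
    ]
    if sync_names and behaviour == "async":
        # Strict mode: refuse to start with any sync middleware installed.
        raise RuntimeError(f"Synchronous middleware not allowed: {sync_names}")
    if sync_names and behaviour == "adapt":
        # Lenient mode: keep working, but tell the user what it costs.
        warnings.warn(
            f"Adapting synchronous middleware {sync_names}; "
            "each request will hold a thread while the view runs."
        )
    return sync_names


async def async_mw(request):
    return request


def sync_mw(request):
    return request
```

With these two example middlewares, `check_middleware([async_mw, sync_mw], "adapt")` warns and carries on, while the same list under `"async"` raises.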

As for pausing the sync middleware, this can probably be achieved via AST manipulations, but if __init_subclass__ was considered ugly monkeypatching, then this is way off the mark.

andrewgodwin · September 11, 2019, 4:11pm

Yeah, I am not expecting this to be pretty which is why I’m not assuming we’ll use it (even if we can pull it off).

Your proposed idea of how to adapt middleware is nice - I like the “explicit failure if there’s non-async middleware” mode. That could make rewriting them more palatable. And, as you mention, merely having an async context is worth something, even if it does consume a thread.

davidfstr · September 26, 2019, 11:08pm

I have done research on all the default middlewares that come with a new Django project. Most of them look like they can be entirely rewritten as natively async, with a few exceptions. Details below.

√ = can be rewritten as fully async
(no mark) = has parts that need to be sync under certain conditions

SecurityMiddleware √
SessionMiddleware
- session get - may database query
- session save - may database query
CommonMiddleware √
CsrfViewMiddleware
- render template (only if request rejected) - template render is sync
- session set - ok for built-in but not for 3rd-party
AuthenticationMiddleware √
- HttpRequest.user = SimpleLazyObject(hits database)
MessageMiddleware
- message storage load - may session lookup - ok for built-in but not for 3rd-party
- message storage update - may session set - ok for built-in but not for 3rd-party
XFrameOptionsMiddleware √

Based on this research, I’ll move to rewrite all the middlewares with √ as async, altering MiddlewareMixin (which is used by all the above middlewares) to support old-style middlewares that are async.

davidfstr · September 27, 2019, 8:34pm

I now have a branch async_middleware, based off the tip of Andrew’s async_views, that:

alters MiddlewareMixin to export an async interface and support wrapping old-style middleware classes (which can be now async in addition to sync),
changes all of the built-in middlewares mentioned in the previous post to be async.

More work is still needed:

[.] Test suite fails

[x] generic_views.test_dates.ArchiveIndexViewTests.test_archive_view_invalid - Fix make_middleware_decorator() to support async old-style middleware classes.
[.] flatpages_tests.test_csrf.FlatpageCSRFTests.test_fallback_flatpage - Fix “django.db.utils.OperationalError: database table is locked: django_site”. Anybody know what this error means?
(… probably more …)

[ ] Add more tests to do things like running the standard middleware stack with 3rd party session backends, message storage backends, etc that are @async_unsafe. Fix any issues identified.

[ ] Documentation for MiddlewareMixin should be extended to show how it now supports mixing in to async middleware classes. Also show caveats in upgrading older users of MiddlewareMixin, who must now call super().__init__(…) properly.

griff_rees · September 27, 2019, 9:35pm

Hi. Mostly a note to myself to remember what I’m working on but: perhaps my work on adding async methods to the Client class may be helpful…

andyide · September 29, 2019, 10:55pm

I just want to again thank all the folks working on the async functionality.

It is appreciated.

andrewgodwin · January 28, 2020, 8:32pm

Just thought I’d come back here and cap this off with the news that a redesign of how middleware operates lets a piece of middleware be both async- and sync-capable simultaneously, making this problem very tractable! We can likely port all of Django’s shipped middleware to this model without too much effort.

andyide · January 28, 2020, 9:49pm

Very good news indeed! Another monkey off your back!

Damn monkeys!

JonasKs · January 29, 2020, 7:52am

Interesting! Please post when it’s ready to be viewed by the public. I need to rewrite mine to be async too.

allen-munsch · July 10, 2020, 7:14pm

For anyone else who finds their way to this thread. Saw some notes related to how the sync/async issue with middleware might be handled in 3.1 here: https://docs.djangoproject.com/en/3.1/topics/http/middleware/#async-middleware
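
For a quick idea of the shape described there, here is a minimal sketch in plain Python, with dicts standing in for request and response objects and `header_middleware` as a made-up example. In Django 3.1 itself you would normally apply the `sync_and_async_middleware` decorator from `django.utils.decorators`, which sets the two capability flags shown below; they are set by hand here only so the sketch runs without Django installed.

```python
import asyncio
import inspect


def header_middleware(get_response):
    """Middleware factory that adapts to a sync or async get_response."""
    if inspect.iscoroutinefunction(get_response):
        async def middleware(request):
            response = await get_response(request)
            response["X-Example"] = "async"
            return response
    else:
        def middleware(request):
            response = get_response(request)
            response["X-Example"] = "sync"
            return response
    return middleware


# Django 3.1 reads these flags to learn which modes a middleware supports.
header_middleware.sync_capable = True
header_middleware.async_capable = True


def sync_view(request):
    return {"status": 200}


async def async_view(request):
    return {"status": 200}


sync_result = header_middleware(sync_view)({})
async_result = asyncio.run(header_middleware(async_view)({}))
```

The same factory serves both kinds of handler, so no thread has to be held open just to bridge a sync middleware around an async view.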

JonasKs · July 13, 2020, 6:36am

Nice! I’m a maintainer of a middleware and need to look more into this soon. Have any of the Django-shipped middlewares been rewritten for async?

Last time I tested there was also only one request per thread; has this been changed to multiple requests per thread now?

andrewgodwin · July 13, 2020, 11:07pm

The Django middlewares have been made async-compatible so they only use a thread on the way in and out, rather than keeping it open the entire request, but they’re not fully async.

Full-async mode allows as many requests per thread as your CPU can handle, but it’s still the case that if you bring a sync middleware in that’s totally incompatible that it takes one request per thread (Python forces that on us). With the async-aware middleware that’s still running things in threads for handling requests/responses, though, I think it should be able to fit quite a few requests per thread, but I’d need to go check.

JonasKs · August 4, 2020, 10:05am

The Django middlewares have been made async-compatible so they only use a thread on the way in and out, rather than keeping it open the entire request, but they’re not fully async.

I’m not sure if I fully understand this. What do you mean it only uses a thread on the way in and out? Is the request object passed over to another thread for the views? Or is the entire request just handled on a thread, but has the async context set up for you? (This is already a huge thing obviously, just want to ensure I understand correctly)

Full-async mode allows as many requests per thread as your CPU can handle, but it’s still the case that if you bring a sync middleware in that’s totally incompatible that it takes one request per thread (Python forces that on us).

In order to make the Django middlewares fully async, the ORM needs to be async first - right? So at the moment no middlewares are 100% async?

In addition to this, I’ve also been a bit confused about why Django keeps using the deprecated MiddlewareMixin. My understanding is that we shouldn’t use this at all, yet I see it’s been updated for async support.

I’m happy to take these questions elsewhere if you feel that they would do better in another forum section or on the mailing list.

andrewgodwin · August 4, 2020, 10:32pm

So, what it means is that a synchronous thread is used for the short calls of handling the request and response, but unlike the naive solution, is not held open while the main view runs. This means a single thread can service many middlewares on many concurrent requests.
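
As a rough illustration of that "thread on the way in and out" idea, here is a sketch using only the standard library. It is not Django's implementation: Django relies on asgiref's `sync_to_async`, whereas this uses the event loop's default executor as a stand-in, and `ExampleMiddleware` and `view` are hypothetical.

```python
import asyncio
import threading


class ExampleMiddleware:
    """Hypothetical old-style middleware with sync request/response hooks."""

    def __init__(self, get_response):
        self.get_response = get_response  # here: an async callable

    def process_request(self, request):
        request["seen_by"] = threading.current_thread().name
        return None  # None means "carry on to the view"

    def process_response(self, request, response):
        response["processed"] = True
        return response

    async def __call__(self, request):
        loop = asyncio.get_running_loop()
        # Short hop into a worker thread for the sync request hook.
        response = await loop.run_in_executor(None, self.process_request, request)
        if response is None:
            # No thread is held here: the view runs on the event loop.
            response = await self.get_response(request)
        # Second short hop for the sync response hook.
        return await loop.run_in_executor(
            None, self.process_response, request, response
        )


async def view(request):
    await asyncio.sleep(0.01)  # stands in for slow async work
    return {"status": 200}


async def main():
    handler = ExampleMiddleware(view)
    # Many concurrent requests share the same small thread pool.
    return await asyncio.gather(*(handler({"id": i}) for i in range(20)))


responses = asyncio.run(main())
```

Each worker thread is only busy for the two short hook calls, so it is free to serve other requests while any given view is awaiting.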

MiddlewareMixin is still around in Django because all the built in middleware uses it - I don’t claim to have all the answers why, but it was a very convenient single point to upgrade and fix every single middleware rather than patching each middleware individually.

andrewgodwin · August 4, 2020, 10:35pm

Oh, and with regards to the ORM - yes, the middlewares are not fully async, they are merely async compatible. Some middlewares do no DB access, though, and those could be made fully async right now if we wanted (but they all just use a sync thread for the moment because I didn’t want to poke too many wasps’ nests at once).

hendrikfrentrup · October 7, 2020, 6:34am

I hope this fits into the discussion about “trouble with middleware”, it’s about the HTTP view decorators (I think it’s actually closer to the Django core than middleware, but lines are blurry sometimes, no?) The @require_http_methods decorator doesn’t work with async views and I think it certainly should since it doesn’t touch the ORM (unlike the @login_required)

I don’t want to just mention the bug I filed about this, but rather point to a bit of experimenting I did to try to fix it, which you can see here. It seems to solve the problem by awaiting the coroutine for an async view and return without awaiting for sync views.

I have not contributed to the Django project before and I’d be keen to contribute this as a patch. So, I am planning to write a test for the async decorators (sync tests are passing). Anything else to consider? All feedback appreciated.
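
For readers following along, one possible shape for a decorator that handles both kinds of view is sketched below. This is not the linked patch and not Django's code: `require_methods` is a hypothetical stand-in, requests and responses are plain dicts, and it picks the wrapper up front based on `inspect.iscoroutinefunction` rather than inspecting the view's return value.

```python
import asyncio
import inspect
from functools import wraps


def require_methods(allowed):
    """Hypothetical stand-in for an HTTP-method-checking view decorator."""

    def decorator(view):
        def reject(request):
            # Shared check: a 405 placeholder, or None if the method is allowed.
            if request["method"] not in allowed:
                return {"status": 405}
            return None

        if inspect.iscoroutinefunction(view):
            @wraps(view)
            async def wrapper(request):
                rejection = reject(request)
                if rejection is not None:
                    return rejection
                return await view(request)  # async view: await the coroutine
        else:
            @wraps(view)
            def wrapper(request):
                rejection = reject(request)
                if rejection is not None:
                    return rejection
                return view(request)  # sync view: return directly
        return wrapper

    return decorator


@require_methods(["GET"])
def sync_view(request):
    return {"status": 200}


@require_methods(["GET"])
async def async_view(request):
    return {"status": 200}
```

Because the async wrapper is itself a coroutine function, the decorated async view is still recognised as async by whatever calls it.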
