Async Performance

I spent some time today running performance numbers with various parts of the async stack loaded in.

My main concern is the speed impact on synchronous Django when this code lands - I don't want sites that only use sync code to see a serious performance hit.

Sadly, right now we’re seeing a 10x performance slowdown on request processing (from 0.6ms to 6ms). I did some playing around, and the lowest we can possibly get this (i.e. the cost of instantiating an event loop at all) is 0.8ms, which would not be a huge increase and would be acceptable if we can get down to it.
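
Roughly, the kind of micro-benchmark I mean for that floor is something like this (a sketch rather than the exact script - it just times a fresh event loop being created, run and closed per call):

```python
import asyncio
import timeit

async def noop():
    # Stand-in for the smallest possible async request handler.
    return None

def loop_per_call():
    # Worst case: create a fresh event loop for every "request",
    # run one trivial coroutine on it, then tear it down again.
    loop = asyncio.new_event_loop()
    try:
        loop.run_until_complete(noop())
    finally:
        loop.close()

if __name__ == "__main__":
    runs = 1000
    total = timeit.timeit(loop_per_call, number=runs)
    print(f"{total / runs * 1000:.3f} ms per create/run/close cycle")
```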

This is going to need some modification of asgiref and the complex code around threadlocals; now that we have the ability to run sync code in the same thread as other sync code, a lot of that code can likely be thrown away and replaced with just a safety catch that makes sure you don’t call the sync-only helpers from async threads.
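
By “safety catch” I mean something along these lines - the decorator name is made up, it’s just to show the shape of the check:

```python
import asyncio
import functools

def sync_only(func):
    # Hypothetical guard: refuse to run a sync-only helper when there is a
    # running event loop in the current thread (i.e. we're in async code).
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No running loop in this thread, so plain sync code - fine.
            return func(*args, **kwargs)
        raise RuntimeError(
            f"{func.__name__}() is sync-only; wrap it in sync_to_async() "
            "to call it from async code."
        )
    return wrapper

@sync_only
def touch_threadlocal_state():
    # Placeholder for one of the existing threadlocal-based helpers.
    return "ok"
```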

Otherwise, there’s the chance we’ll have to maintain two request paths in BaseHandler - one sync and one async. It’s not a huge amount of code, but it would be nice to avoid this.
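
For illustration, the split would look roughly like this (names and bodies are placeholders, not the real BaseHandler internals):

```python
import asyncio

class DualPathHandler:
    """Sketch only - not Django's actual BaseHandler."""

    def get_response(self, request):
        # Pure sync path: plain function calls, no event loop involved,
        # so a WSGI-only project pays nothing for async support.
        return f"sync response for {request!r}"

    async def get_response_async(self, request):
        # Async path: a native coroutine that can await async middleware
        # and views directly instead of spinning up a loop per request.
        await asyncio.sleep(0)  # stand-in for awaiting real async work
        return f"async response for {request!r}"

# A WSGI entry point would call handler.get_response(...), while an ASGI
# server would `await handler.get_response_async(...)`.
```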


Hi Andrew. I am reluctant to spend too much additional time on the async middleware until we have a plan for the overall performance concern.

In particular, it would be useful to have a lightweight benchmarking tool that shows (1) the synchronous performance and (2) the asynchronous performance of a tiny Django app with just a view function returning a fixed hello-world response, with no other middleware or models involved.
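
To be concrete, the whole “app” could probably be a single file along these lines (the layout and names are only a suggestion, and the ASGI entry point assumes the in-progress ASGI handler is importable):

```python
# benchapp.py - a single-file Django project with one trivial view,
# usable for both WSGI and ASGI benchmarking.
from django.conf import settings
from django.http import HttpResponse
from django.urls import path

settings.configure(
    DEBUG=False,
    ALLOWED_HOSTS=["*"],
    ROOT_URLCONF=__name__,
    SECRET_KEY="benchmark-only",
    MIDDLEWARE=[],  # deliberately empty: no middleware, no models
)

def hello(request):
    return HttpResponse("Hello, world")

urlpatterns = [path("", hello)]

# Expose both entry points so the same file can be served by a WSGI
# server and, once the ASGI handler is available, an ASGI server.
from django.core.wsgi import get_wsgi_application

wsgi_application = get_wsgi_application()

try:
    from django.core.asgi import get_asgi_application
    asgi_application = get_asgi_application()
except ImportError:
    asgi_application = None  # Django build without the ASGI handler yet
```

Serving that with a WSGI server and an ASGI server in turn, then hitting it with any HTTP load tool, would give the two numbers to compare.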

Such a tool would allow experimentation (and quick cycling) to get the basic baseline performance to a reasonable value. It would also be useful for me - and probably others on the async Django project - to ensure that PRs we’re putting together don’t regress performance unacceptably.

Since you detected this performance issue already, I suspect you may already have a leg-up on creating such a benchmarking tool. If not, then I or others may be able to put one together once we can find some cycles.


I think it may be a bit too early to focus on performance regressions. There is still a considerable global slowdown inherent to sync_to_async/async_to_sync which will probably need to be resolved independently of the work on the rest of the features.
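
As a reference point, the raw wrapper overhead can be isolated with a small timeit script like this (a sketch - it only measures the thread/loop round trip, not a real request):

```python
import timeit

from asgiref.sync import async_to_sync, sync_to_async

def plain():
    return 42

# One full sync -> async -> sync round trip, roughly what a sync view
# pays when everything is routed through an async handler.
round_trip = async_to_sync(sync_to_async(plain))

if __name__ == "__main__":
    runs = 1000
    direct = timeit.timeit(plain, number=runs)
    wrapped = timeit.timeit(round_trip, number=runs)
    print(f"direct call: {direct / runs * 1e6:.1f} µs")
    print(f"round trip:  {wrapped / runs * 1e6:.1f} µs")
```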

I still wonder how the performance will look when we start comparing views that are more IO-bound (IIRC, the tests were mostly around an empty view).
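
For the IO-bound case, a pair of views that just sleep for a fixed time would probably be enough to compare under concurrent load (purely illustrative, and the second one assumes async views are wired up):

```python
import asyncio
import time

from django.http import HttpResponse

def sleepy_sync_view(request):
    # Simulated blocking IO: holds the worker/thread for 50 ms.
    time.sleep(0.05)
    return HttpResponse("sync done")

async def sleepy_async_view(request):
    # Simulated async IO: yields the event loop for 50 ms, so other
    # requests can progress on the same loop in the meantime.
    await asyncio.sleep(0.05)
    return HttpResponse("async done")
```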

It’s probably not too difficult to put together such a test with https://github.com/django/djangobench, but it would, of course, be easier if we could reuse what’s already been done.

I do indeed have a benchmarking setup: a very basic Django view that returns a string, which I run through an HTTP benchmarking tool in both ASGI mode and WSGI mode (as well as mutating various parts of the ASGI stack, how many sync_to_async calls there are, etc.).

I’ve not been able to do it for the last few weeks because of conferences and moving house, and that will continue for at least another week or two. I don’t mind if we add a few ms to each request in async mode, but the problem was that it was adding a lot of time to projects even in WSGI mode (as the WSGI path used a single async handler). We might need two handler paths for speed.