I have been testing my backend, which is based on Django and Django REST Framework, to find the minimal possible response time of my REST services. I ran the tests under both ASGI and WSGI configurations, and the difference between the two shocked me.
The same operation (same view) takes ~15ms longer under ASGI. These are laboratory conditions: one client and one request in flight at a time, meant only to show the minimal latency Django can reach. This is not a concurrency test of the kind that would demonstrate the strengths of ASGI. I compared the results for identical requests in ASGI and WSGI modes.
My test environment was a dedicated physical machine used only for this testing at the time, with plenty of free RAM and CPU resources, local NVMe storage, and a LAN connection to the client workstation.
I tested various configuration scenarios, but there was always a large overhead for ASGI requests. I checked different Python versions (3.9, 3.10, 3.11, up to 3.12), Django versions (4.1, 4.2, up to 5), asgiref versions (3.6 to 3.8), as well as different versions of gunicorn and uvicorn/daphne, with/without DEBUG…
The difference was always ~15ms +/-2ms, so I settled on Django 5.0.4 and the latest stable versions of the other Python packages.
Below is the view which I used during the tests (it returns just a static value, without any database connection):
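from rest_framework import status
from rest_framework.response import Response
from rest_framework.views import APIView

class TestView(APIView):
    def get(self, request, *args, **kwargs):
        response_data = {'message': "Test Message"}
        return Response(response_data, status=status.HTTP_200_OK)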
Results (I used wrk and also locust for the testing):
WSGI:
gunicorn message_broker.wsgi:application --workers 4 --bind 0.0.0.0:8000
avg ~ 1ms
ASGI:
gunicorn message_broker.asgi:application --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
avg ~ 17ms
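For reference, single-request latency can also be cross-checked without wrk or locust. The sketch below is a minimal sequential timer built only on the standard library. The names time_calls, summarize, and make_http_get are illustrative, and the host, port, and path in the usage note are placeholders rather than values from my setup. It reuses the connection only if the server allows keep-alive, which may itself differ between the two server configurations.

```python
import http.client
import statistics
import time


def time_calls(fn, n=200, warmup=20):
    """Call fn() sequentially and return per-call latencies in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not recorded
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Return min / median / mean / max of a list of latencies (ms)."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "mean": statistics.fmean(samples),
        "max": max(samples),
    }


def make_http_get(host, port, path):
    """Build a callable that issues one GET request per call."""
    conn = http.client.HTTPConnection(host, port, timeout=5)

    def get():
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection can be reused
        return resp.status

    return get
```

Example use against a running server (placeholder address and path): summarize(time_calls(make_http_get("127.0.0.1", 8000, "/test/"))). Looking at min and median next to the mean helps to tell a constant per-request cost from occasional slow outliers.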
To find the cause of the difference, I used django-cprofile-middleware. With django-cprofile-middleware added at the end of the MIDDLEWARE setting (in settings.py), the profile showed the same total time of ~1ms for both ASGI and WSGI, so at the level this middleware measures there was no difference.
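Since the profile inside the middleware stack shows no difference, one possible next step is to time the request one level further out, at the ASGI entry point. The sketch below is a plain ASGI wrapper. TimingWrapper and demo_app are hypothetical names, and I have not run this in the tests above. It records the time until the response starts and the total time spent inside the wrapped application.

```python
import asyncio
import time


class TimingWrapper:
    """Plain ASGI wrapper that reports wall time per HTTP request."""

    def __init__(self, app, report=print):
        self.app = app
        self.report = report  # callable that receives one dict per request

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan / websocket traffic through untouched.
            await self.app(scope, receive, send)
            return

        start = time.perf_counter()
        first_byte = None

        async def timed_send(message):
            nonlocal first_byte
            if message["type"] == "http.response.start" and first_byte is None:
                first_byte = time.perf_counter()
            await send(message)

        try:
            await self.app(scope, receive, timed_send)
        finally:
            end = time.perf_counter()
            self.report({
                "path": scope.get("path"),
                "to_response_start_ms": (
                    None if first_byte is None else (first_byte - start) * 1000
                ),
                "total_ms": (end - start) * 1000,
            })


async def demo_app(scope, receive, send):
    """Stand-in ASGI app, used only to exercise the wrapper."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})
```

In asgi.py this would presumably be wired as application = TimingWrapper(get_asgi_application()). If the wrapper also reports ~1ms, the extra time is spent outside Django (server worker, connection handling, network). If it reports ~17ms, the time is spent between the ASGI handler and the middleware stack.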
I know that the optimal solution for ASGI would be an asynchronous view, which runs as a native coroutine (instead of in a separate thread that sync_to_async uses to run synchronous code). In my case, a standard synchronous Django REST Framework view is served under ASGI.
I am also aware that, for a synchronous view under ASGI, Django runs the view’s synchronous code in a thread pool using ThreadPoolExecutor. That works fine: I analysed the details with extended logging (process, thread, thread name) and with PYTHONASYNCIODEBUG=1. It looked good to me: same process, same thread, and a new thread pool number for each request (the view code runs inside ThreadPoolExecutor-NNN_0, with a different pool number NNN each time). I also checked whether there was an issue with middleware that does not support async, as suggested by the Django documentation:
Middleware can be built to support both sync and async contexts. Some of Django’s middleware is built like this, but not all. To see what middleware Django has to adapt for, you can turn on debug logging for the django.request logger and look for log messages about “Asynchronous handler adapted for middleware …”.
Nothing is logged by the adapt_method_mode() method of the BaseHandler class, so there is no problem with extra switches between async and sync.
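For completeness, a logging configuration along the following lines should be enough to surface those messages. This is a minimal sketch for settings.py, not my exact configuration; the handler name "console" is arbitrary.

```python
import logging.config

# Route DEBUG records from the django.request logger to the console.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        "django.request": {
            "handlers": ["console"],
            "level": "DEBUG",
            "propagate": False,
        },
    },
}

if __name__ == "__main__":
    # Django applies LOGGING itself at startup; this call only allows a
    # sanity check of the dict outside a Django project.
    logging.config.dictConfig(LOGGING)
```

With this in place, any "Asynchronous handler adapted for middleware …" or "Synchronous handler adapted for middleware …" message would appear on stderr once per adapted middleware when the handler is loaded.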
My MIDDLEWARE configuration is as follows:
MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]
I read the documentation and watched conference talks about ASGI and its internals to find an answer. I also searched for the cause in the django/core/handlers source code (asgi.py, base.py), the Django utils source code, and the asgiref source code. I found nothing special.
The obvious differences between ASGI and WSGI in my case are:
- how the server is run:
  gunicorn (WSGI) vs gunicorn with the uvicorn worker class (ASGI) or daphne (same results)
- Django internals:
  get_wsgi_application() (WSGI) vs get_asgi_application() (ASGI), where the synchronous view is processed in a thread pool
I wonder what causes such a big difference, i.e., ~1ms for WSGI vs ~17ms for ASGI (about 15ms of overhead). It is hard to believe that it results only from using get_asgi_application() and threading for the synchronous view.
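A rough way to check the threading hypothesis in isolation is to measure how long it takes to hand a trivial synchronous function to a thread pool from an event loop. The sketch below uses only asyncio and ThreadPoolExecutor from the standard library. It is an approximation of what sync_to_async does, not asgiref's actual code path, and sync_view and measure_thread_hop are illustrative names. It compares a direct call, a reused single-thread pool, and a fresh pool per call. The last variant mirrors the changing pool number NNN I see in the logs.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def sync_view():
    """Stand-in for a synchronous view; returns the thread it ran on."""
    return threading.current_thread().name


async def measure_thread_hop(n=200):
    """Return mean per-call times (ms) for three ways of running sync_view."""
    loop = asyncio.get_running_loop()

    # Baseline: plain function call on the event-loop thread.
    start = time.perf_counter()
    for _ in range(n):
        sync_view()
    direct_ms = (time.perf_counter() - start) * 1000 / n

    # One long-lived single-thread pool reused for every call.
    with ThreadPoolExecutor(max_workers=1) as pool:
        await loop.run_in_executor(pool, sync_view)  # starts the worker thread
        start = time.perf_counter()
        for _ in range(n):
            await loop.run_in_executor(pool, sync_view)
        shared_pool_ms = (time.perf_counter() - start) * 1000 / n

    # A new pool, and therefore a new thread, for every call.
    thread_name = None
    start = time.perf_counter()
    for _ in range(n):
        with ThreadPoolExecutor(max_workers=1) as pool:
            thread_name = await loop.run_in_executor(pool, sync_view)
    fresh_pool_ms = (time.perf_counter() - start) * 1000 / n

    return {
        "direct_ms": direct_ms,
        "shared_pool_ms": shared_pool_ms,
        "fresh_pool_ms": fresh_pool_ms,
        "last_thread": thread_name,
    }
```

Running asyncio.run(measure_thread_hop()) on the test machine gives an estimate of the per-request cost of the thread hop alone. If that cost is far below 15ms, threading by itself is unlikely to be the explanation.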
I will be thankful for any ideas or suggestions.