DEP 0009: Async-capable Django (Discussion about async connection for #35629)
sources:
- original mr - Fixed #35629 -- Added support for async database connections and cursors. by fcurella · Pull Request #18408 · django/django · GitHub
- experimental mr - represented alternative for AsyncConnectionHandler by Arfey · Pull Request #1 · Arfey/django · GitHub
Hi there. During code review we ran into a few contentious topics that I think we need to discuss before going forward.
topics:
- Global connection
- Different connection classes
Global connection
https://github.com/django/deps/blob/main/accepted/0009-async.rst
In DEP 0009, the global connection is described as something we need to eliminate in the future and replace with an explicit new_connections context manager.
async def get_authors(pattern):
    async with db.new_connections():
        return [
            author.name
            async for author in Authors.objects.all()
        ]
Each new_connections block creates a new, independent database connection. In the current implementation, this is the only way to create a new connection.
It solves two problems:
- It adds the ability to create an independent connection and, as a result, to execute queries in parallel
- It adds a clean way to create and close a connection (without calling close_old_connections)
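To make the first point concrete, here is a minimal sketch using only the standard library. FakeConnection and this new_connections are toy stand-ins (assumptions, not Django's implementation); the sketch only shows that each block gets its own connection, so two queries can overlap.

```python
import asyncio
import contextlib
import contextvars
import itertools

_ids = itertools.count(1)


class FakeConnection:
    """Toy stand-in for a database connection (not Django's class)."""

    def __init__(self):
        self.id = next(_ids)
        self.closed = False

    async def fetch(self, value):
        await asyncio.sleep(0.01)  # pretend to wait on the database
        return (self.id, value)

    async def close(self):
        self.closed = True


_current = contextvars.ContextVar("current_connection")


@contextlib.asynccontextmanager
async def new_connections():
    """Hypothetical model of db.new_connections(): open on entry, close on exit."""
    conn = FakeConnection()
    token = _current.set(conn)
    try:
        yield conn
    finally:
        _current.reset(token)
        await conn.close()


async def query(value):
    async with new_connections() as conn:
        return await conn.fetch(value)


async def main():
    # Each task opens its own connection, so the two queries can overlap.
    return await asyncio.gather(query("a"), query("b"))
```

Running asyncio.run(main()) returns one (connection id, value) pair per query, and the two connection ids differ.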
The main problem for me with this feature is that it is not an optional tool for performance optimisation; it is a necessity that you have to use everywhere.
As a result, I don’t understand how to solve the problems described below.
- How to use it during the request/response cycle?
Right now, if you need to fetch a user inside a middleware, Django creates a new connection and then reuses it inside the view and the business-logic functions. If we don’t need to fetch the user, the connection is created inside the view. If we don’t need a connection inside the view either, no connection is created at all.
With the current solution we can open a separate connection in each place that needs one, but this approach produces a lot of connection setup calls, which are obviously not free. Alternatively, we can write a middleware:
# Named db_middleware so it does not shadow the db module used below.
async def db_middleware(get_response):
    async def middleware(request):
        async with db.new_connections():
            response = await get_response(request)
        return response
    return middleware
But then we create a new connection for every request (with the current implementation) even if we don’t need one. We also pay a performance penalty on the async API with this middleware, because of async_to_sync on every request, even if we don’t need an async connection there.
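The cost of the middleware approach can be illustrated with a toy model. This is a sketch under assumptions: new_connections here is a stand-in that opens eagerly and records each open in a list, and connection_middleware and static_view are hypothetical names. It only demonstrates that a view which never touches the database still pays for one connection per request.

```python
import asyncio
import contextlib

opened = []  # records every connection the toy "database" opens


@contextlib.asynccontextmanager
async def new_connections():
    # Hypothetical stand-in: the connection is opened eagerly on entry.
    opened.append("conn")
    yield


def connection_middleware(get_response):
    async def middleware(request):
        async with new_connections():
            return await get_response(request)
    return middleware


async def static_view(request):
    # This view never touches the database.
    return "pong"


async def serve(n):
    handler = connection_middleware(static_view)
    return [await handler(f"req-{i}") for i in range(n)]
```

After asyncio.run(serve(3)), the opened list holds three entries even though no query was ever executed.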
- How to write utility functions?
Say I have a function that creates a notification.
async def create_notification(user_id, msg):
    async with db.new_connections():
        await Notification.objects.create(user_id=user_id, msg=msg)
I use it everywhere. Then at some point I get a requirement to send a notification during the user creation process.
async def create_user(name):
    async with transaction.async_atomic():
        user = await User.objects.create(name=name)
As you can see, I need the notification to use the same connection as the transaction here. As a result, I have to remove new_connections from create_notification and wrap every call to this function in new_connections instead.
async def create_notification(user_id, msg):
    await Notification.objects.create(user_id=user_id, msg=msg)

async def create_user(name):
    async with transaction.async_atomic():
        user = await User.objects.create(name=name)
        await create_notification(user.id, 'hello!')
If I need to use create_user inside another transaction, I have to do the same refactoring over and over.
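Why the helper cannot keep its own new_connections can be shown with a toy model. Everything here is an assumption made for illustration (FakeConnection, the committed list, and simplified new_connections and atomic helpers); it is not Django code. The point is only that a write made on a separate connection is not part of the caller's transaction, so it survives a rollback.

```python
import asyncio
import contextlib
import contextvars

committed = []  # rows visible to everyone, i.e. the toy "database"


class FakeConnection:
    def __init__(self):
        self.pending = None  # uncommitted rows while inside a transaction

    def insert(self, row):
        if self.pending is not None:
            self.pending.append(row)
        else:
            committed.append(row)  # autocommit outside a transaction


_current = contextvars.ContextVar("current_connection")


@contextlib.asynccontextmanager
async def new_connections():
    token = _current.set(FakeConnection())
    try:
        yield
    finally:
        _current.reset(token)


@contextlib.asynccontextmanager
async def atomic():
    conn = _current.get()
    conn.pending = []
    try:
        yield
    except Exception:
        conn.pending = None  # roll back
        raise
    else:
        committed.extend(conn.pending)
        conn.pending = None


async def create_notification(msg):
    async with new_connections():  # opens its own connection
        _current.get().insert(("notification", msg))


async def create_user(name):
    async with new_connections():
        async with atomic():
            _current.get().insert(("user", name))
            await create_notification("hello!")
            raise RuntimeError("user creation failed")
```

When create_user fails, the user row is rolled back but the notification row is already committed, which is exactly the inconsistency the refactoring above is meant to avoid.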
Solution:
Keep the global connection variable, with lazy initialization of the async connection.
# global
async_connections = AsyncConnectionHandler()

async def foo():
    connection = async_connections[DEFAULT_DB_ALIAS]
    async with connection.cursor() as cursor:
        res = await cursor.execute("select 1")
        return await res.fetchone()
It works well and doesn’t have the problems described above (you can check it in my experimental MR), combined with a signal handler that calls close_old_connections. I think this approach covers 90% of use cases. For the rest, we can implement a context manager (but as an optional feature).
async def foo():
    async with async_connections[DEFAULT_DB_ALIAS] as connection:
        async with connection.cursor() as cursor:
            res = await cursor.execute("select 1")
            return await res.fetchone()
We can add __enter__/__exit__ (__aenter__/__aexit__ on the async side, since that is what async with calls) to AsyncConnectionHandler and ConnectionHandler and remove new_connections entirely (but that is just a detail).
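The lazy-initialization idea can be sketched without Django. LazyConnectionHandler and FakeAsyncConnection are hypothetical toy classes, not the code from the experimental MR, and the handler keeps a plain dict where a real one would need per-task or per-request storage. The sketch shows that looking a connection up is free and that nothing is opened until the first query.

```python
import asyncio


class FakeAsyncConnection:
    """Toy connection that only 'connects' on first use."""

    opened = 0  # how many real connections were established

    def __init__(self, alias):
        self.alias = alias
        self.is_open = False

    async def execute(self, sql):
        if not self.is_open:  # lazy connect on the first query
            self.is_open = True
            FakeAsyncConnection.opened += 1
        return f"{self.alias}: {sql}"

    async def close(self):
        self.is_open = False


class LazyConnectionHandler:
    """Hypothetical stand-in for AsyncConnectionHandler."""

    def __init__(self):
        self._connections = {}

    def __getitem__(self, alias):
        # Creating the wrapper is cheap; nothing is opened here.
        if alias not in self._connections:
            self._connections[alias] = FakeAsyncConnection(alias)
        return self._connections[alias]

    async def close_all(self):
        for conn in self._connections.values():
            await conn.close()
        self._connections.clear()


async_connections = LazyConnectionHandler()


async def handle_request(needs_db):
    if needs_db:
        await async_connections["default"].execute("select user")   # middleware
        await async_connections["default"].execute("select posts")  # view
    # Roughly what a request-finished signal handler would do.
    await async_connections.close_all()
```

A request that never queries opens nothing, and a request that queries from both the middleware and the view shares a single connection.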
What do you think?
Different connection classes
In the current implementation, async and sync connections are separate and unrelated (if you open a transaction on an async connection, you can’t see its changes from a sync connection). But right now they share the same class, so if I create an async connection I still have access to the sync methods (yes, they raise errors, but they are there).
As a result, we have a lot of a-prefixed methods (aclose, aexecute, etc.). I don’t see the point of that. I would prefer to follow the Python database API (PEP 249 – Python Database API Specification v2.0 | peps.python.org) with separate classes for sync and async connections.
Maybe someone can explain why the current design makes sense? (I tested separate classes, and you can check the result in the experimental MR.)
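For illustration, here is a minimal sketch of what separate classes could look like. SyncCursor and AsyncCursor are hypothetical toy classes, not the ones in the experimental MR. Both keep the PEP 249 method names execute and fetchone; only the async class defines them as coroutines, so no a-prefixed duplicates are needed.

```python
import asyncio


class SyncCursor:
    """Toy sync cursor using PEP 249 method names."""

    def __init__(self):
        self._rows = []

    def execute(self, sql):
        self._rows = [(sql,)]  # pretend the query echoes its own text
        return self

    def fetchone(self):
        return self._rows.pop(0) if self._rows else None


class AsyncCursor:
    """Toy async cursor: same names, but every method is a coroutine."""

    def __init__(self):
        self._rows = []

    async def execute(self, sql):
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        self._rows = [(sql,)]
        return self

    async def fetchone(self):
        return self._rows.pop(0) if self._rows else None


def run_sync():
    return SyncCursor().execute("select 1").fetchone()


async def run_async():
    res = await AsyncCursor().execute("select 1")
    return await res.fetchone()
```

With this split, an async cursor simply has no sync methods to call by mistake, and calling code differs only by the await keyword.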