database_sync_to_async calls causing asyncio.CancelledError

I’m seeing odd behavior in one of my applications. It has a POST endpoint that receives a list of order IDs; after processing each order, it must inform an external API of that order’s monetary value. This is what we did:

order_details_tasks = [
    asyncio.create_task(
        self.launch_individual_order_details(order),
        name=f"task_order_{order.id}",
    )
    for order in active_orders
]
results = await asyncio.gather(*order_details_tasks, return_exceptions=True)
for task, result in zip(order_details_tasks, results):
    if isinstance(result, Exception):
        print(f"⚠️ Task '{task.get_name()}' raised an exception: {result}")
    else:
        print(f"✅ Task '{task.get_name()}' succeeded with result: {result}")
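One thing to keep in mind about the result loop above: since Python 3.8, asyncio.CancelledError derives from BaseException rather than Exception, so an isinstance(result, Exception) check reports a cancelled task as a success. Below is a minimal, self-contained sketch of a classification that separates the three outcomes. The coroutines ok, fails and hangs and the helper classify are hypothetical stand-ins, not part of the application.

```python
import asyncio


async def ok(n):
    # Stand-in for a task that completes normally.
    await asyncio.sleep(0)
    return n


async def fails():
    # Stand-in for a task that raises an ordinary exception.
    raise ValueError("boom")


async def hangs():
    # Stand-in for a task that is still waiting when it gets cancelled.
    await asyncio.sleep(3600)


def classify(result):
    # CancelledError is a BaseException (Python 3.8+), so check it first.
    if isinstance(result, asyncio.CancelledError):
        return "cancelled"
    if isinstance(result, BaseException):
        return "failed"
    return "ok"


async def main():
    tasks = [
        asyncio.create_task(ok(1), name="task_ok"),
        asyncio.create_task(fails(), name="task_fails"),
        asyncio.create_task(hangs(), name="task_hangs"),
    ]
    await asyncio.sleep(0)  # give every task a chance to start
    tasks[2].cancel()       # simulate an external cancellation
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return {t.get_name(): classify(r) for t, r in zip(tasks, results)}, results


outcome, results = asyncio.run(main())
print(outcome)
```

With a check like this, cancelled tasks show up in the output instead of being counted as successes.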

The launch_individual_order_details(order) function runs some non-I/O logic and then does the following:

logger.debug(f"Sending order details with success for order: {order.id}")    
await order_service.send_order_request(order)

Inside send_order_request we create an entry in a table called Transaction with the order ID and the corresponding order amount, in a pending state, and then send an HTTP request using the aiohttp client. Afterwards we update the transaction status to Success if the response is successful, or to Error if the request fails.
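To make the flow easier to follow, here is a rough sketch of it using in-memory stand-ins only. The Transaction dataclass, the post_order coroutine and the status constants are hypothetical; the real code uses a Django model and an aiohttp request, neither of which is shown in this thread.

```python
import asyncio
from dataclasses import dataclass

PENDING, SUCCESS, ERROR = "pending", "success", "error"


@dataclass
class Transaction:
    # Hypothetical in-memory stand-in for the Transaction table row.
    order_id: int
    amount: float
    status: str = PENDING


async def post_order(order_id, amount):
    # Hypothetical stand-in for the HTTP call; returns an HTTP status code.
    await asyncio.sleep(0)
    return 200 if amount >= 0 else 500


async def send_order_request(order_id, amount):
    tx = Transaction(order_id, amount)  # 1. create the pending entry
    try:
        status_code = await post_order(order_id, amount)  # 2. call the API
    except Exception:
        tx.status = ERROR  # request failed outright
        return tx
    # 3. record the outcome of the request
    tx.status = SUCCESS if 200 <= status_code < 300 else ERROR
    return tx


good = asyncio.run(send_order_request(1, 10.0))
bad = asyncio.run(send_order_request(2, -1.0))
print(good.status, bad.status)
```

Note that an except Exception clause does not catch asyncio.CancelledError, so in a flow shaped like this a task cancelled between steps 1 and 3 would leave its transaction in the pending state.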

The problem is that under moderate load, when a pod reaches about 70% of the CPU limit we give it, some of our tasks simply stop executing without reporting any error to the event loop.

After further investigation, I wrapped the coroutine that makes the request in a try/except block and printed the traceback whenever an asyncio.CancelledError is raised. It turns out those tasks are being cancelled while they interact with my MySQL database, as in the following:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File  line 100, in launch_individual_order_details
    await database_sync_to_async(self.update_transaction_status)(
        details_transaction, Transaction.SUCCESS
    )
  File "/opt/venv/lib/python3.13/site-packages/asgiref/sync.py", line 485, in __call__
    ret = await exec_coro
          ^^^^^^^^^^^^^^^
asyncio.exceptions.CancelledError
{}
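For anyone who wants to reproduce this kind of diagnosis, here is a minimal sketch of a wrapper that records the traceback when a task is cancelled and then re-raises, so the cancellation still propagates normally. The names guarded, slow_update and events are hypothetical; slow_update only stands in for the awaited database call.

```python
import asyncio
import traceback

events = []  # collected (label, traceback text) pairs, for illustration


async def guarded(coro, label):
    # Await the coroutine and record where it was when cancellation hit.
    try:
        return await coro
    except asyncio.CancelledError:
        events.append((label, traceback.format_exc()))
        raise  # always re-raise so the task ends up cancelled


async def slow_update():
    # Hypothetical stand-in for the awaited database update.
    await asyncio.sleep(3600)


async def main():
    task = asyncio.create_task(guarded(slow_update(), "task_order_1"))
    await asyncio.sleep(0)  # let the task start and reach its await
    task.cancel()           # simulate whatever cancels the task
    results = await asyncio.gather(task, return_exceptions=True)
    return task.cancelled(), results


cancelled, results = asyncio.run(main())
print(cancelled, events[0][0])
```

The traceback only shows where the task was suspended when the cancellation arrived, not who requested it, so a database call appearing there does not by itself prove the database triggered the cancellation.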

If, instead of running these tasks in the event loop, I run them through Celery, everything works as expected and they don’t fail. I don’t know what is happening, but I suspect the event loop may be cancelling some of these tasks because of limits on database access. Can someone give me some hints on this? Running 100 tasks per minute in a 30-minute heavy load test, I would say about 35 in total fail.

Is this your production environment, or your development environment?

If it’s your production environment, are you able to replicate this in your testing / development environment?

Have you tracked the number of database connections to see if that’s an issue? (Perhaps checking the database logs?)

If you’re not certain about this, have you tried increasing the number of allowable connections?

Hello there @KenWhitesell! It happened in my testing environment, in one of my GKE pods. I haven’t checked the connections yet, but I will try that now. Thank you for the suggestion.

Well @KenWhitesell, I queried my database during the load test I just ran, and the highest Threads_connected value I saw was 970, while my max_connections variable is set to 4030, so the cause is probably something else. It’s still odd to me that the cancelled tasks are always cancelled while executing a database query, and always during the same queries: either updating a specific transaction or creating one.
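As a quick sanity check on the numbers reported in this thread, the arithmetic below computes the connection headroom and the approximate failure rate. The variable names are illustrative, and the failure count is the rough figure of 35 given earlier, not a measured value.

```python
# Figures quoted in the thread.
threads_connected_peak = 970   # highest Threads_connected seen in the test
max_connections = 4030         # MySQL max_connections setting

tasks_per_minute = 100
test_minutes = 30
failed_tasks = 35              # rough estimate from the original post

connection_utilization = threads_connected_peak / max_connections
total_tasks = tasks_per_minute * test_minutes
failure_rate = failed_tasks / total_tasks

print(f"connection utilization: {connection_utilization:.1%}")  # 24.1%
print(f"total tasks: {total_tasks}")                            # 3000
print(f"failure rate: {failure_rate:.2%}")                      # 1.17%
```

At roughly a quarter of the connection limit, the server-side max_connections setting looks unlikely to be the bottleneck, which matches the conclusion above.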