I’m using an nginx, uWSGI, django, postgresql stack deployed on Ubuntu 20.04. I’m seeing random (in time and code location) errors because the database connection is unexpectedly closed during the handling of a request. I think I’ve figured out why, but it’s so weird I don’t know who’s ‘wrong’. Here’s what I’ve found out…
In my DB settings I have CONN_MAX_AGE=None
, so in routine operation connections stay alive forever. There are no errors in any logs (uWSGI, Postgresql, my internal logging) - there was no reported reason why the DB connection was being closed. So I hacked the Django source to print a stack trace in the BaseDatabaseWrapper._close()
method, using traceback.print_stack()
.
This trace makes no sense - here’s an example:
...
File "/deployment/v_env/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1208, in execute_sql
return list(result)
File "/deployment/v_env/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1646, in cursor_iter
for rows in iter((lambda: cursor.fetchmany(itersize)), sentinel):
File "/deployment/v_env/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1646, in <lambda>
for rows in iter((lambda: cursor.fetchmany(itersize)), sentinel):
File "/deployment/v_env/lib/python3.8/site-packages/django/http/response.py", line 292, in close
signals.request_finished.send(sender=self._handler_class)
File "/deployment/v_env/lib/python3.8/site-packages/django/dispatch/dispatcher.py", line 180, in send
return [
File "/deployment/v_env/lib/python3.8/site-packages/django/dispatch/dispatcher.py", line 181, in <listcomp>
(receiver, receiver(signal=self, sender=sender, **named))
File "/deployment/v_env/lib/python3.8/site-packages/django/db/__init__.py", line 38, in close_old_connections
conn.close_if_unusable_or_obsolete()
...
In the middle of Django building a query, we have a call to HttpResponseBase.close()
. This makes no sense - there’s no way that the caller shown in the stack made this call. I have many other examples with different upper call stacks - in all cases the call to close()
appears to be injected from nowhere.
Now this method is called by uWSGI from the C runtime…so I’m thinking, is this normal? If you have a separate C thread call Python at some random time, does it simply inject into the call stack in this way? I’ve not been able to find out if that’s the case.
Let’s assume this is what is happening and that it’s actually okay - a big assumption but I’ve looked at the uWSGI source and it certainly takes lots of precautions to acquire the GIL before interacting with Python so it looks okay.
So maybe the HTTP client is closing the connection early, uWSGI closes the response, and we end up with that stack trace I’m seeing. So why is Django closing the DB connection if this is all correct?
Well, it’s doing that because Django’s end-of-request handling will close the DB connection if required or it’s in a mess. A DB transaction was in effect before the HttpResponseBase.close()
call hijacked the call stack, and the BaseDatabaseWrapper.close_if_unusable_or_obsolete
method has this line:
# If the application didn't restore the original autocommit setting,
# don't take chances, drop the connection.
if self.get_autocommit() != self.settings_dict['AUTOCOMMIT']:
self.close()
Autocommit is set to False during a transaction, but the setting is True. So that’s why the DB connection is closed. When this injected close()
call unwinds and the interrupted call stack continues, code accessing the DB finds the connection unexpectedly closed, and I get django.db.utils.InterfaceError: cursor already closed
exceptions in my application code.
So if the call to HttpResponseBase.close()
jumping into the call stack is legit, this code is wrong, and we have a bug in Django. If not, I guess there’s a bug in uWSGI.
Any insight greatly appreciated.