DB connection closed by uWSGI in the middle of handling a request?!

I’m using an nginx, uWSGI, django, postgresql stack deployed on Ubuntu 20.04. I’m seeing random (in time and code location) errors because the database connection is unexpectedly closed during the handling of a request. I think I’ve figured out why, but it’s so weird I don’t know who’s ‘wrong’. Here’s what I’ve found out…

In my DB settings I have CONN_MAX_AGE=None, so in routine operation connections stay alive forever. There are no errors in any logs (uWSGI, Postgresql, my internal logging) - there was no reported reason why the DB connection was being closed. So I hacked the Django source to print a stack trace in the BaseDatabaseWrapper._close() method, using traceback.print_stack().

This trace makes no sense - here’s an example:

...
File "/deployment/v_env/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1208, in execute_sql
  return list(result)
File "/deployment/v_env/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1646, in cursor_iter
  for rows in iter((lambda: cursor.fetchmany(itersize)), sentinel):
File "/deployment/v_env/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1646, in <lambda>
  for rows in iter((lambda: cursor.fetchmany(itersize)), sentinel):
File "/deployment/v_env/lib/python3.8/site-packages/django/http/response.py", line 292, in close
  signals.request_finished.send(sender=self._handler_class)
File "/deployment/v_env/lib/python3.8/site-packages/django/dispatch/dispatcher.py", line 180, in send
  return [
File "/deployment/v_env/lib/python3.8/site-packages/django/dispatch/dispatcher.py", line 181, in <listcomp>
  (receiver, receiver(signal=self, sender=sender, **named))
File "/deployment/v_env/lib/python3.8/site-packages/django/db/__init__.py", line 38, in close_old_connections
  conn.close_if_unusable_or_obsolete()
...

In the middle of Django building a query, we have a call to HttpResponseBase.close(). This makes no sense - there’s no way that the caller shown in the stack made this call. I have many other examples with different upper call stacks - in all cases the call to close() appears to be injected from nowhere.

Now this method is called by uWSGI from the C runtime…so I’m thinking, is this normal? If you have a separate C thread call Python at some random time, does it simply inject into the call stack in this way? I’ve not been able to find out if that’s the case.

Let’s assume this is what is happening and that it’s actually okay - a big assumption but I’ve looked at the uWSGI source and it certainly takes lots of precautions to acquire the GIL before interacting with Python so it looks okay.

So maybe the HTTP client is closing the connection early, uWSGI closes the response, and we end up with that stack trace I’m seeing. So why is Django closing the DB connection if this is all correct?

Well, it’s doing that because Django’s end-of-request handling will close the DB connection if required or it’s in a mess. A DB transaction was in effect before the HttpResponseBase.close() call hijacked the call stack, and the BaseDatabaseWrapper.close_if_unusable_or_obsolete method has this line:

# If the application didn't restore the original autocommit setting,
# don't take chances, drop the connection.
if self.get_autocommit() != self.settings_dict['AUTOCOMMIT']:
   self.close()

Autocommit is set to False during a transaction, but the setting is True. So that’s why the DB connection is closed. When this injected close() call unwinds and the interrupted call stack continues, code accessing the DB finds the connection unexpectedly closed, and I get django.db.utils.InterfaceError: cursor already closed exceptions in my application code.

So if the call to HttpResponseBase.close() jumping into the call stack is legit, this code is wrong, and we have a bug in Django. If not, I guess there’s a bug in uWSGI.

Any insight greatly appreciated.

Thinking about it, I think this call stack injection must be the way calling from C to python works. How else could it work? The GIL could be obtained anytime, and Python isn’t going to start a new interpreter per call, so it’s got to just run in the current context.

Also I don’t think this is a bug in Django - the client has disconnected so there’s no way to write anything back - pulling the DB plug immediately makes some sense.

So how to save other folks spending ages trying to figure out what’s going wrong when this happens? Some documentation would be nice. I’ll raise a request.

Except, the client isn’t connected to uwsgi - they’re connected to nginx. And nginx doesn’t disconnect from uwsgi based upon client action - at least, not that I’ve ever seen evidence of. (I don’t know if there’s a difference with this among the different ways for nginx to connect to uwsgi - e.g. socket vs tcp, or uwsgi vs proxy.)

Now, if there’s some degree of unreliability between nginx and uwsgi, I guess this makes sense.

But my general expectation would be that nginx would be serving as an isolation layer between the client and uwsgi.

(Admitttedly, this is all mostly conjecture on my part - not direct knowledge.)

I’d have to say that based on my direct experience nginx is passing on the disconnect - that’s the only explanation of what I’m seeing.

@KenWhitesell Perhaps this section of my nginx config explains this:

  location / {
    include uwsgi_params;
    uwsgi_param UWSGI_SCHEME $scheme;
    uwsgi_pass unix:///tmp/app-uwsgi.sock;
    uwsgi_buffering off;
  }

I explicitly turn off buffering as I sometimes use streaming responses.

Just to finish this thread off…

I turned uwsgi_buffering back on in my nginx config but it made no difference.

I’ve deployed lots of django apps in the past but this is the first time this has been observed - I think there’s something a little unusual about how the client environment closes connections (a GameMaker app deployed in a custom Android environment).

Anyways…I ended up adding a handler for Django’s request_finished signal and used that to set a “request finished” flag in my code - if my request handler raises an unhandled exception and that flag is set, I know that the HTTP connection was terminated before the request handler finished, and I just ignore whatever exception was thrown.