Intermittent PostgreSQL connection errors

Our web server (a Django app on Gunicorn, running on Google Cloud Run) connects to a PostgreSQL 15 database (on Google Cloud SQL) through Psycopg. Most queries succeed, but recently about 1% of them fail at random moments, with varying error messages such as:

  • connection failed: region:db_name/.s.PGSQL.5432" failed: server closed the connection unexpectedly - This probably means the server terminated abnormally before or while processing the request.
  • got message type “P”, length 1380524074
  • connection failed: region:db_name/.s.PGSQL.5432" failed: FATAL: password authentication failed for user “postgres”
  • consuming input failed: server closed the connection unexpectedly - This probably means the server terminated abnormally before or while processing the request.
  • invalid socket

Sometimes we see an error on the server side at the same moment, for example:

  • FATAL: canceling authentication due to timeout
  • FATAL: connection to client lost
  • FATAL: password authentication failed for user

The password authentication failed error puzzles me: we’re always connecting with the same password.

The got message type "P" message looks cryptic to me, and the reported length (over 1 GB!) is abnormal; I don’t see why such a long message would be sent.
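One way to look at that number, assuming it is the 4-byte big-endian length field that follows the one-byte message type in the PostgreSQL wire protocol, is to convert it back into raw bytes. This is only a sketch for inspecting the value; the helper name length_as_bytes is made up for illustration:

```python
import struct


def length_as_bytes(length: int) -> bytes:
    """Return the 4 bytes that encode `length` as a big-endian 32-bit integer."""
    return struct.pack("!i", length)


# Length reported in the "got message type" error above.
reported_length = 1380524074

raw = length_as_bytes(reported_length)
print(raw)         # b'RI *'
print(b"P" + raw)  # b'PRI *' (message type byte followed by the "length" bytes)
```

The bytes are printable ASCII, so the client was probably reading text rather than a real PostgreSQL message. Together with the message type they spell "PRI *", which is how the HTTP/2 connection preface ("PRI * HTTP/2.0") begins. Whether that is what actually arrived on the database socket is a guess on my part, not something I have confirmed.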

In the Django settings file, I tried different settings:

  • CONN_HEALTH_CHECKS = True or False
  • CONN_MAX_AGE = 0 (new connection for every request) or None (unlimited persistent connections)
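For reference, this is roughly where those two options live in the settings file. Both are keys of a DATABASES entry rather than top-level settings. The environment variable names (DB_NAME, DB_USER, DB_PASSWORD, DB_HOST) are placeholders for illustration, not our real configuration:

```python
import os

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        # Placeholder environment variables (assumed names).
        "NAME": os.environ.get("DB_NAME", ""),
        "USER": os.environ.get("DB_USER", ""),
        "PASSWORD": os.environ.get("DB_PASSWORD", ""),
        # For Cloud SQL this is the directory of the Unix socket.
        "HOST": os.environ.get("DB_HOST", ""),
        # 0 closes the connection at the end of each request;
        # None keeps connections open indefinitely.
        "CONN_MAX_AGE": 0,
        # When True, a reused persistent connection is checked before
        # each request and re-established if it is no longer usable.
        "CONN_HEALTH_CHECKS": True,
    }
}
```

With CONN_MAX_AGE = 0 there are no persistent connections to reuse, so CONN_HEALTH_CHECKS should only matter in the CONN_MAX_AGE = None case.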

Resources (CPU, memory, disk space, …) are well below the limits.

PostgreSQL is at version 15, Django at version 4.2.9 and psycopg at version 3.1.17.

We tried reverting psycopg to 3.1.14 and Django to 4.2.8, since we had been running those versions for several weeks without problems, but the connection issues are still present.

Does anyone have any ideas on how I can investigate this problem?