Peak database connections

I deployed my Django application on GCP with Cloud Run and Cloud SQL, and I’m spending a lot of money on Cloud SQL: I have to provision 2 vCPUs just to get a connection limit above 100, even though the instance uses less than 10% of its memory and CPU. I tried configuring Django’s connection pool, but I still see a daily peak of 149 connections.

DATABASES['default']['OPTIONS'] = {
    'pool': {
        'min_size': 5,
        'max_size': 20
    }
}

Cloud Run introduces quite a few variables that make this a challenge to debug without knowing more about your setup.

Do you have a value other than 0 set for CONN_MAX_AGE? If so, it’s possible that Cloud Run is killing your instances before Django cleans up these connections, so they’re staying open (potentially indefinitely). Generally, it’s safer to run with a CONN_MAX_AGE of 0 on Cloud Run.
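For reference, a minimal settings sketch showing that value made explicit (engine and database name are placeholders, not taken from your config):

```python
# Hypothetical Django settings fragment: CONN_MAX_AGE = 0 closes the
# database connection at the end of every request, which is the safest
# choice on Cloud Run since instances can be terminated at any time.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "appdb",       # placeholder database name
        "CONN_MAX_AGE": 0,     # 0 = do not reuse connections across requests
    }
}
```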

Alternatively, it could be that you’re receiving that many requests all at once and Cloud Run is configured with a low ‘Maximum concurrent requests per instance’, so each request is creating a new Cloud Run instance, which doesn’t use your pool/existing sessions.

One approach you can use to figure out what’s going on is to run SELECT * FROM pg_stat_activity; on your database instance and check for ‘idle’ connections. If these were all created a while ago, it’s likely the first issue. If they all have similar timestamps, then it could be the second issue.
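To make that output easier to scan, you could group it; a sketch of such a query (the columns are from the standard pg_stat_activity view, but adjust the filter to your setup):

```sql
-- Count connections by state and client application, and show how long
-- the oldest connection in each group has been in that state.
SELECT state,
       application_name,
       count(*)          AS connections,
       min(state_change) AS oldest_state_change
FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY state, application_name
ORDER BY connections DESC;
```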

My database configuration in the Django settings is just what I showed above, so yes, CONN_MAX_AGE is already at its default of 0.

I ran the query you indicated and this was the result:


I have a value of 80 in Cloud Run for ‘Maximum concurrent requests per instance’.

Great, OK. A couple more things to check then:

Cloud SQL should have an ‘Active Connections’ graph. Does this spike at certain points or stay steady?

Your Cloud Run Metrics has a ‘Container instance count’ graph - what does this peak at? Similarly, if you’re OK sharing it, what does your ‘Request count’ graph look like?

“Active Connections” graph:

“Container instance count” graph:

“Request Count” graph:

“Maximum concurrent requests” graph:

Database dashboard from 7 days ago:

Could I just remove the connection pool configuration from my Django settings? Is it really necessary in this case?

Your pool probably isn’t helping much in this case: as configured, it will try to establish and hold open at least 5 connections to your database per worker. This means that when your instance count spikes to ~20, your connection count spikes with it.
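To put rough numbers on that (the peak instance count is read off your graph, so treat these figures as estimates):

```python
# Back-of-the-envelope estimate of peak database connections when each
# Cloud Run instance holds its own pool. Figures are assumptions taken
# from this thread: ~20 instances at peak, pool min_size=5, max_size=20.
peak_instances = 20
pool_min_size = 5
pool_max_size = 20

# Connections held open even when the instances are idle.
min_held = peak_instances * pool_min_size

# Ceiling if every instance's pool fills up at the same time.
worst_case = peak_instances * pool_max_size

print(min_held)    # 100
print(worst_case)  # 400
```

That 100-connection floor alone lines up with the ~149-connection daily peak you’re seeing.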

This pool is only really useful if you intend to keep the same instance around for a while and intend to serve many requests from it (by setting ‘minimum instances’ in Cloud Run for example).
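If you do want to keep instances warm so the pool pays off, that can be set from the CLI; a sketch (the service name and region are placeholders):

```shell
# Keep at least one warm instance around so its pool gets reused,
# instead of cold-starting a new instance (and a new pool) per burst.
gcloud run services update my-django-service \
  --region=us-central1 \
  --min-instances=1 \
  --concurrency=80
```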

Looking at your request count and instance count graphs - it seems like Cloud Run is starting new instances for many of your requests, so it could be that even though a single instance is configured to handle 80 requests, it’s hitting the 60% CPU usage target and scaling anyway.

You could consider:

  • Increasing the size of the Cloud Run instances and the number of workers per instance, so that a single instance can handle more traffic and therefore make more efficient use of the pool.
  • Moving pool management outside of the application/Cloud Run by running pgbouncer somewhere else, e.g. in Compute Engine, or by using a database provider that supports pgbouncer (like Crunchy Bridge).
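If you go the pgbouncer route, the relevant part of pgbouncer.ini is small; a sketch with placeholder values (the host, database name, and paths are assumptions, not your actual setup):

```ini
; pgbouncer.ini sketch: pool connections in front of Cloud SQL so all
; Cloud Run instances share one small server-side pool.
[databases]
appdb = host=10.0.0.5 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction      ; reuse server connections between transactions
default_pool_size = 20       ; server connections per database/user pair
max_client_conn = 200        ; client side can be far larger than server side
```

With transaction pooling, many short-lived clients map onto a handful of long-lived server connections, which is exactly the shape of traffic Cloud Run produces.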

Thank you very much for your help!

I will first try to use pgbouncer to manage the connection pool through Compute Engine.


I’ve been trying all day to implement pgbouncer, but GCP makes it very difficult and I haven’t had any success. I think I’ll opt to increase the resources of my Cloud Run containers instead. I found this aspect of Google Cloud to be somewhat problematic.