Deploy Django with Cloud Run and Cloud SQL

I deployed a website this past weekend using Cloud Run and Cloud SQL on GCP. However, under a high number of simultaneous requests (around 5,000), the site simply crashed. I tried raising the maximum number of Cloud Run instances to 5 and the per-instance concurrency limit to 1,000, but I kept running into database connection errors and slow performance.

Does anyone have any tips or configurations for handling websites with significant user demand in this scenario?

Focus on connection pooling in your application code - this is likely the main bottleneck. When Cloud Run scales up, it can overwhelm your database with too many connections. Implement a proper connection pool that maintains persistent connections and reuses them across requests. Then increase your Cloud SQL instance size rather than just adding more Cloud Run instances. For most web applications, a medium-tier Cloud SQL instance with proper connection pooling can handle thousands of simultaneous users.
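As a starting point, the persistent-connection part of this is built into Django via CONN_MAX_AGE. A minimal settings.py sketch; the database name, credentials, and Cloud SQL socket path below are placeholders:

# settings.py sketch: keep each worker's connection open and reuse it
# across requests instead of opening a new one per request
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb',                                # placeholder
        'USER': 'myuser',                              # placeholder
        'PASSWORD': 'secret',                          # placeholder
        'HOST': '/cloudsql/project:region:instance',   # typical Cloud SQL unix socket path
        'CONN_MAX_AGE': 60,            # reuse a connection for up to 60 seconds
        'CONN_HEALTH_CHECKS': True,    # discard broken connections before reuse
    }
}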

Additionally, add caching for frequently accessed data (using Redis or Memcached) and ensure you’re not running expensive queries on each request. These two changes - proper connection pooling and strategic caching - will resolve most high-traffic scaling issues without requiring complex architecture changes.
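For the caching piece, one possible shape using Django's built-in Redis backend (available since Django 4.0); the Redis address, model, and timeout below are assumptions:

# settings.py: point Django's cache framework at a Redis instance (e.g. Memorystore)
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.redis.RedisCache',
        'LOCATION': 'redis://10.0.0.3:6379',   # assumed Redis/Memorystore address
    }
}

# in a view or service module: cache an expensive query instead of running it on every request
from django.core.cache import cache
from myapp.models import Article   # hypothetical model

def popular_articles():
    return cache.get_or_set(
        'popular_articles',
        lambda: list(Article.objects.order_by('-views')[:20]),
        timeout=300,   # seconds
    )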

The version of Django I am using already supports connection pooling. Would that be enough? I configured it this way:

# settings.py: database config from the environment (e.g. django-environ's DATABASE_URL)
DATABASES = {'default': env.db()}

# enable Django 5.1+ built-in connection pooling (requires psycopg 3 with the pool extra)
DATABASES['default']['OPTIONS'] = {
    'pool': True
}

Also, I hadn't increased my Cloud SQL instance size because the monitoring graphs didn't show any bottleneck. However, I stress-tested it with Locust, simulating 5,000 users, and after enabling the pool the application no longer crashed; it only slowed down for a while before stabilizing, which could be Cloud Run scaling up instances.
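For reference, a locustfile for this kind of test can be as small as the following sketch; the endpoint, host, and user counts are assumptions:

# locustfile.py: rough sketch of a 5,000-user load test
from locust import HttpUser, task, between

class SiteUser(HttpUser):
    wait_time = between(1, 3)    # pause 1-3 seconds between tasks per simulated user

    @task
    def load_home_page(self):
        self.client.get('/')     # assumed endpoint; add the real hot paths here

# run with something like:
#   locust -f locustfile.py --users 5000 --spawn-rate 100 --host https://your-service.run.app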

A concurrency limit of 1,000 requests per Django instance is quite a lot. I'd recommend allowing fewer concurrent requests per instance and scaling horizontally instead (more instances).

Connection pooling is particularly useful if you have a large number of threads per process (or in an async context).
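If it helps, Django 5.1+ also accepts a dict of psycopg_pool.ConnectionPool options instead of just True, so you can size the pool per instance; the numbers below are only illustrative:

# settings.py: illustrative pool sizing (Django 5.1+, psycopg 3 with the pool extra)
DATABASES['default']['OPTIONS'] = {
    'pool': {
        'min_size': 2,     # connections kept open per Cloud Run instance
        'max_size': 8,     # cap so max_instances * max_size stays under Cloud SQL's connection limit
        'timeout': 10,     # seconds to wait for a free connection before raising
    }
}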

More information on what a “crash” actually looks like would help with diagnosing this. Do you get a helpful exception, or is it just a slowdown? Do the database logs show anything?

I just ran another test, making 5000 requests again. I’m trying to stress my application as much as possible since the last launch was a disaster… We had 5000 simultaneous accesses on the platform, and it completely crashed haha.

But from the test, I noticed a significant improvement. I still encountered many errors, which seem to be due to the instances not being able to scale enough to meet the demand. But now I’m wondering—isn’t Cloud Run designed for high-demand applications?

I configured my Cloud Run as follows:

  • Memory: 4 GiB
  • vCPUs: 2
  • Maximum simultaneous requests per instance: 80
  • Revision scaling: 0 - 5
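For reference, assuming a service named mysite (a placeholder), that configuration corresponds roughly to:

gcloud run services update mysite \
    --memory=4Gi --cpu=2 \
    --concurrency=80 \
    --min-instances=0 --max-instances=5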

I couldn’t allocate more resources because my GCP quota limits me… And from what I’ve seen, Google Cloud won’t let me request an increase, since it says I’m not fully utilizing the resources I already have and therefore don’t need more. Now I’m stuck trying to figure out where the bottleneck in my application is.

Just to clarify, I’m using a Postgres database on Cloud SQL with the following specs:

  • vCPUs: 2
  • Memory: 8 GB

Database error logs:

Throwing more resources at the problem isn’t always the best solution. 4 GiB and 2 vCPUs is quite a lot of compute.

If your app is async and doing DB queries, that can cause lots of DB connections, and timeouts with the connection pool. It sounds like pool contention is your main issue, not necessarily resources.
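One quick way to confirm that, rather than guessing from the Cloud Run graphs, is to watch how many connections the Cloud SQL instance actually has open while the load test runs. A rough sketch you could run from a management command or the Django shell:

# connection_check.py: count open Postgres connections by state during the test
from django.db import connection

def connection_counts():
    with connection.cursor() as cursor:
        cursor.execute(
            "SELECT state, count(*) FROM pg_stat_activity GROUP BY state"
        )
        return cursor.fetchall()   # e.g. [('active', 12), ('idle', 80), ...]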