Hi everyone,
I’m refactoring a legacy Django application (Django + uWSGI + PostgreSQL) handling complex multi-step questionnaires. We hit a critical stability wall that only occurs in Production, while the Local environment (runserver) works perfectly.
The Context & Environment:

- Local: Works fine, likely because test data is small enough to keep the session cookie under the browser's limit.
- Production: Fails catastrophically. Real-world data volume (50+ answers) pushes the session payload over the edge.
The Symptoms:

- Infinite Loops: Users fill Page 1, click "Next", and the form loops back.
- The Error: Sentry logs `Cookie "sessionid" is invalid because its size is too big. Max size is 4096 B` (we use `signed_cookies`).
- Worker Churn: uWSGI workers constantly restart (OOM/timeouts) due to massive session serialization.
- Data Integrity: `NULL` values found in `NOT NULL` DB columns.
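To put numbers on the symptom: the sketch below is plain Python that only *roughly* models how the `signed_cookies` backend serializes and signs the session (the actual Django signing format differs in detail); the answer count, answer length, and secret key are made-up assumptions. Even so, it shows why ~50 real answers sail past the 4096-byte cookie limit.

```python
import json, base64, hashlib, hmac

# Hypothetical payload: 50 questionnaire answers of ~100 characters each,
# mimicking what the legacy code stuffed into request.session.
answers = {f"question_{i}": "x" * 100 for i in range(50)}

# Rough model of a signed cookie: JSON-serialize, base64-encode,
# append an HMAC signature (Django's real format is more involved).
payload = base64.urlsafe_b64encode(json.dumps(answers).encode())
signature = hmac.new(b"made-up-secret", payload, hashlib.sha256).hexdigest()
cookie_value = payload + b":" + signature.encode()

print(len(cookie_value))  # comfortably over the 4096-byte limit
```

The base64 step alone inflates the JSON by ~33%, so the payload crosses the limit even before signing overhead.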
Root Cause Analysis:

- Session Abuse: The code hydrated all answers into a dict and dumped them into `request.session`. In Prod, this payload exceeds 4 KB.
- Forms.py Anti-patterns:
  - N+1 Queries: `save()` looped through fields, doing a `Question.objects.get()` for each.
  - Silent Failures: The `save()` method wrapped logic in `try...except: pass`. If the DB rejected data (`IntegrityError`), it failed silently.
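To make the N+1 pattern concrete, here is a minimal sketch using sqlite3 as a stand-in for PostgreSQL and the ORM; the `question` table, row count, and IDs are hypothetical. The first variant mirrors `Question.objects.get(pk=...)` in a loop; the second mirrors a single `Question.objects.filter(pk__in=...)` fetch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE question (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany("INSERT INTO question VALUES (?, ?)",
                 [(i, f"Question {i}") for i in range(1, 51)])

answered_ids = list(range(1, 51))

# Anti-pattern: one query per field, like Question.objects.get() in a loop.
questions_n_plus_1 = [
    conn.execute("SELECT id, text FROM question WHERE id = ?", (qid,)).fetchone()
    for qid in answered_ids
]  # 50 round-trips to the database

# Fix: one query fetching everything, like a single filter(pk__in=...).
placeholders = ",".join("?" * len(answered_ids))
questions_bulk = conn.execute(
    f"SELECT id, text FROM question WHERE id IN ({placeholders})", answered_ids
).fetchall()  # 1 round-trip

print(len(questions_bulk))  # 50
```

On a busy uWSGI worker, collapsing 50 round-trips into one is exactly the kind of change that stops request timeouts from cascading into worker restarts.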
The Fix: We refactored to a stateless architecture:

- Stop Session Hydration: Forms now fetch existing answers directly from the DB via prefetching.
- Atomic Transactions: Wrapped `save()` in `transaction.atomic()`.
- Removed Silent Fail: Replaced `pass` with explicit `ValidationError` raising.
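The atomic-save-plus-explicit-error half of the fix can be sketched like this, again with sqlite3 standing in for the ORM: `save_all`, the `answer` table, and the local `ValidationError` class are all stand-ins, not our actual code. The point is the contrast with `try...except: pass`: a bad row now rolls back the whole batch and surfaces an error instead of silently dropping data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE answer (question_id INTEGER NOT NULL, text TEXT NOT NULL)")

class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError."""

def save_all(answers):
    # All-or-nothing, mirroring transaction.atomic(): the sqlite3
    # connection context manager commits on success, rolls back on error.
    try:
        with conn:
            conn.executemany("INSERT INTO answer VALUES (?, ?)", answers)
    except sqlite3.IntegrityError as exc:
        raise ValidationError(f"Rejected by database: {exc}") from exc

save_all([(1, "ok"), (2, "also ok")])

try:
    save_all([(3, "fine"), (4, None)])  # second row violates NOT NULL
except ValidationError as e:
    print("rejected:", e)

# The failed batch was fully rolled back, including its valid first row.
print(conn.execute("SELECT COUNT(*) FROM answer").fetchone()[0])  # 2
```

With the silent `except: pass`, the second batch would have half-succeeded invisibly, which is exactly how we ended up with integrity problems that nobody could trace.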
Question: Has anyone else seen `signed_cookies` act as a "silent killer" that only manifests when data volume scales up in Production?

