Hi Django fellows.
I’m wondering if there are any caveats to adding the migrate command to our deployment.
My company uses AWS ECS to deploy our Django project. We use gunicorn to run the app in production, but the migrate command is missing from that process. The reason is that our previous DevOps engineer was very conservative about DB operations during app deployment. Fair point. Since we cannot directly access the Docker container in production, we instead generate a SQL file by running sqlmigrate locally, connect to the production DB, and run the SQL script manually.
Sounds pretty unnecessary, doesn’t it?
So I would love to put the migrate command in the deployment script. Then this question hit me: ‘What if one thread is still running old code when the DB migration is already complete?’ If I have many servers running gunicorn on AWS ECS, wouldn’t such a scenario be possible?
I wonder how others handle running the migrate command in production.
Please redirect me if there’s already a related discussion.
P.S.
Please note that we have development and staging servers. We do all the testing there, so it’s highly unlikely that the migrate command will crash.
Hello there!
Will give my 2 bits here..
We’re using AWS AppRunner, and we run the migrate command before starting the server (in the entrypoint script). Due to the blue/green deployment strategy (which I think is similar to the ECS deployment strategy), there is a small window of time in which HTTP requests still go to the previous version of the server, and that may cause some failures. For us, it’s not that big of a deal. Most of the time this is not noticeable to end users.
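For anyone who wants to see the shape of such an entrypoint, here is a minimal sketch in Python. It is not our actual script: the project name `myproject`, the bind address, and the injectable `run` / `exec_server` parameters are all assumptions made for illustration.

```python
#!/usr/bin/env python3
"""Container entrypoint sketch: apply migrations, then hand off to gunicorn."""
import os
import subprocess
import sys

# Hypothetical commands; adjust the module path and bind address to your project.
MIGRATE_CMD = [sys.executable, "manage.py", "migrate", "--noinput"]
SERVER_CMD = ["gunicorn", "myproject.wsgi:application", "--bind", "0.0.0.0:8000"]


def main(run=subprocess.run, exec_server=os.execvp):
    # check=True raises CalledProcessError if migrate exits non-zero,
    # so the server is never started after a failed migration.
    run(MIGRATE_CMD, check=True)
    # Replace this process with gunicorn so it receives container signals directly.
    exec_server(SERVER_CMD[0], SERVER_CMD)


if __name__ == "__main__":
    main()
```

One thing to keep in mind with this pattern: if several containers start at the same time, each of them runs migrate, so some teams prefer to run it once as a separate deployment step instead.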
In our pipeline, we also deploy Celery on an EC2 instance after pushing the new application image to ECR, and we run the migrations there as well before starting the Celery worker. From what I remember, the Celery process is normally the one that executes the migrations (since it starts before the image is picked up and started by AppRunner), and for us this is the more important process to keep in sync with the migrations.
The errors that @leandrodesouzadev pointed out are very real. For small scale, they can be minimal enough that it’s no big deal. For me they were bad enough that I felt I needed to address them, but I really wanted (a) for model and migration changes to merge at the same time, and (b) to be able to avoid running migrations for deployment manually wherever possible.
So I wrote django-safemigrate. It’s not the only solution to the problem, but it’s my take at a solution, and I’ve used it pretty successfully at two different companies. It might help you, too!
That’s a great approach. I just starred the project; it looks like one of the tools I would want when starting a new project.
Just to add a bit more context, this is not a problem for us, even though we receive thousands of requests each day. It really depends on where the important things actually happen in your workflows. For us, most of the work happens in Celery, so that’s the most important process to get right.
That’s good context to consider.
In my case I also work heavily with Celery, and it’s very important that I handle that rollover well. Unfortunately, some of my Celery jobs can take a little while to complete (minutes or tens of minutes, rather than seconds), so the opportunity for accidentally causing disruption gets pretty high. Plus, I watch Sentry for problems, and seeing things error out with schema errors, even in Celery tasks, really makes me nervous that it might be some important operation that’s being dropped on the floor.
It just depends on your application’s workload, and how tolerant you can be of errors during the deployment. For me, I understand the pattern well enough that I’ll generally implement django-safemigrate into my deployment, and then do my best to annotate the migration correctly to avoid the errors. I’ve found the incremental burden to be quite manageable once I’ve practiced it.
It’s still a new thing to learn when you’re getting started, and I don’t want to dismiss that complexity as inconsequential. Once you’re to the place that you’d like to automate migrations, you might consider whether it’s worth some time investment to set things up for it to at least be possible to roll things out without errors. It feels like a good junction point.
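To make the idea of annotating migrations a bit more concrete, here is a small plain-Python sketch of the underlying pattern. It does not use django-safemigrate’s actual API; `Phase`, `PlannedMigration`, `split_for_deploy`, and the migration names are hypothetical names made up for this illustration. The assumption is that additive changes (such as a new nullable column) can run before the new code rolls out, while destructive changes (such as dropping a column) wait until no old code is still running.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    BEFORE_DEPLOY = "before_deploy"  # additive change: old code keeps working
    AFTER_DEPLOY = "after_deploy"    # destructive change: wait until old code is gone


@dataclass(frozen=True)
class PlannedMigration:
    name: str
    phase: Phase


def split_for_deploy(plan):
    """Split an ordered, linear migration plan into (run_now, deferred).

    Everything from the first AFTER_DEPLOY migration onward is deferred,
    because later migrations in a linear chain depend on earlier ones.
    """
    run_now, deferred = [], []
    for migration in plan:
        if deferred or migration.phase is Phase.AFTER_DEPLOY:
            deferred.append(migration)
        else:
            run_now.append(migration)
    return run_now, deferred


# Hypothetical plan: add a column first, drop the old one only after rollout.
plan = [
    PlannedMigration("0007_add_nickname_nullable", Phase.BEFORE_DEPLOY),
    PlannedMigration("0008_drop_legacy_name", Phase.AFTER_DEPLOY),
]
run_now, deferred = split_for_deploy(plan)
```

Here `run_now` would be applied before the new containers start, and `deferred` once the old ones have drained.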
Thanks for the invaluable replies.
In hindsight, our app does have brief moments when an old thread runs against the new codebase and causes errors, even when we migrate manually. Mostly those are resolved within a few seconds, or a minute at most.
Many of our requests are read operations, so I’d say a short mismatch is fairly safe. Since our app doesn’t get millions of requests, I’ll try putting the migrate command in the script before gunicorn, but if requests spike it will be worth considering a library like django-safemigrate.
Again, thank you to both of you.