I am implementing a Django server that requires some tasks to run in the background. To be more specific, I get API calls from my frontend that contain a large image in chunks. Then I reassemble those chunks and I would like to run some tasks on the image in the background. These tasks include using algorithms for checking the quality of the image, as well as extracting some metadata.
I was going to use Celery for this, and to be honest the main reason was that it was the only task queue solution I knew. So I decided to do a bit of research and found out about other solutions such as RabbitMQ and Kafka. I am not familiar with them, but they seem to be more popular than Celery. Also, in the future I would like to add caching to my web app using Redis, so easy integration with Redis is something I am looking for.
If anyone has worked with these tools before and has some feedback (issues, concerns, tips) to share, I would be really thankful.
It’s not a choice between Celery and RabbitMQ or Celery and Redis.
Celery can be thought of as your task manager / process dispatcher. Celery relies on a message broker, typically RabbitMQ or Redis, to manage work.
RabbitMQ and Redis are the storage mechanism / communications channel / pipeline for communication between your main application and your background tasks.
You could write a background task handler directly with the Redis (or RabbitMQ) APIs - but then you’re duplicating the work already being performed by Celery.
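To make that division of labour concrete, here is a minimal standard-library sketch of what a hand-rolled handler involves. It is only an illustration: `queue.Queue` stands in for the broker (Redis or RabbitMQ), the thread stands in for a Celery worker, and `check_image_quality` is a hypothetical placeholder for the real image checks.

```python
import queue
import threading

# Stand-in for the broker (Redis or RabbitMQ): it only holds messages.
broker = queue.Queue()
results = {}

def check_image_quality(image_path):
    # Hypothetical placeholder for the real quality-check algorithm.
    return {"path": image_path, "ok": image_path.endswith(".png")}

def worker():
    # Stand-in for a Celery worker: pull a message, run the task, store the result.
    while True:
        message = broker.get()
        if message is None:  # sentinel: shut the worker down
            broker.task_done()
            break
        task_id, image_path = message
        results[task_id] = check_image_quality(image_path)
        broker.task_done()

worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

# The "web request" side: enqueue the work and return immediately.
broker.put(("task-1", "uploads/scan.png"))

# For the demo only: wait for the queue to drain, then stop the worker.
broker.join()
broker.put(None)
worker_thread.join()
print(results["task-1"])
```

Even this toy version has to deal with dispatching, shutdown, and result storage, and it still lacks retries, separate worker processes, and persistence across restarts. Those are the parts Celery already provides on top of the broker.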
We use Celery a lot for what we do. We used to primarily use RabbitMQ as the storage facility, but have started migrating to Redis for some work.
Both are good - our change is not due to one being “better” than the other - it’s an issue of what will work best for what we are doing.
They are different, and their feature sets only partly overlap - you might want to understand the differences between them and make a decision based upon your own requirements.
Thanks Ken for your answer. It answers a lot of my questions. Is Celerybeat kind of the same thing as Redis and RabbitMQ? Or is it a part of Celery that just starts the tasks?
One more question regarding Redis. If I want to cache some of the results of API calls, wouldn’t Redis be a good solution for that? I think of Redis as sitting between the backend and frontend as a cache.
Celery beat (if this is what you’re referring to) is just a scheduler for setting up tasks to run on a scheduled basis.
Celery runs the tasks. The tasks generally are created either by your application making a call to Celery, or by beat starting a task at the right time.
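As a rough sketch of the scheduled side, a beat schedule is just a mapping from a name to a task and an interval. The fragment below assumes the Celery app loads its configuration from Django settings with `namespace="CELERY"`, as in Celery's Django guide; the entry name and the task path `images.tasks.purge_stale_uploads` are hypothetical.

```python
# settings.py (fragment)
# Read by Celery beat, assuming the app was configured with
# app.config_from_object("django.conf:settings", namespace="CELERY").
CELERY_BEAT_SCHEDULE = {
    "purge-stale-uploads": {
        # Hypothetical task: dotted path to a function registered as a Celery task.
        "task": "images.tasks.purge_stale_uploads",
        # Run once an hour; a plain number is interpreted as seconds.
        "schedule": 3600.0,
    },
}
```

Beat only sends the task message on schedule. A regular Celery worker still has to be running to pick it up and execute it.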
Yes, Redis can be used as a type of cache for this. (So can RabbitMQ, memcached, and your database for that matter.) Kinda depends upon what you’re looking to do with the data.
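For the caching case, a minimal settings sketch might look like this. It assumes Django 4.0 or later, which ships a built-in Redis cache backend, and a Redis server on localhost; the database number `1` is an arbitrary choice to keep cache keys apart from anything else stored in the same Redis instance.

```python
# settings.py (fragment)
CACHES = {
    "default": {
        # Built-in Redis backend (Django 4.0+); requires the redis-py package.
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        # Assumed local Redis server; database 1 is an arbitrary choice.
        "LOCATION": "redis://127.0.0.1:6379/1",
    }
}
```

With that in place, views can store and read values through `django.core.cache.cache` (for example `cache.set(key, value, timeout)` and `cache.get(key)`) without talking to Redis directly.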
OK. Everything makes sense now. Thank you.