I receive a lot of files in requests that I must store on file storage. I would like to either offload the file writing or use aiofiles so that I do not block threads for longer than necessary.
What would be your preferred approach? Should I write a custom storage provider that uses aiofiles (and if someone has already done this, any hints for me)? Or throw it onto a queue (more IO…)? Or something else?
Serious question - have you actually identified this as a problem, or is this as much conjecture as anything else? Does this really happen so often and in such quantity that you can’t spare one process for that period of time?
Without knowing more of the specifics of the situation, such as the number, size, and frequency of these uploads along with the characteristics of the file storage environment, it’s going to be tough to make any specific recommendations.
If you’re writing a custom upload handler to write each block directly to the file storage, there’s nothing to be gained by trying to spawn that off, because your main thread still needs to handle each chunk of data as it comes in.
If there really is that big of a difference in throughput between the rate at which the data is presented to Django and the rate at which it can be written to your storage, you could, for example, create a celery task and feed the data through that task, where the celery task is responsible for writing the data. (Or, depending upon the amount of data being uploaded, you could use a Redis queue or even just local storage as a temporary location.)
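The hand-off described above can be sketched with the standard library alone. This is a minimal illustration, not a Celery implementation: a `queue.Queue` and one background thread stand in for the Celery task or Redis queue, and the names `writer_loop`, `start_writer`, and `stop_writer` are hypothetical. Note the trade-off it implies: if the response is sent before the write has finished, a crash in between loses the payload.

```python
import queue
import tempfile
import threading
from pathlib import Path

# Sentinel object that tells the worker thread to stop.
_STOP = object()


def writer_loop(jobs: queue.Queue) -> None:
    """Take (path, payload) jobs off the queue and write them to disk."""
    while True:
        job = jobs.get()
        try:
            if job is _STOP:
                return
            path, payload = job
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_bytes(payload)
        finally:
            # Mark the job as done so that jobs.join() can return.
            jobs.task_done()


def start_writer(maxsize: int = 1000):
    """Start one background writer; a bounded queue applies back-pressure."""
    jobs = queue.Queue(maxsize=maxsize)
    thread = threading.Thread(target=writer_loop, args=(jobs,), daemon=True)
    thread.start()
    return jobs, thread


def stop_writer(jobs: queue.Queue, thread: threading.Thread) -> None:
    """Ask the writer to finish the queued jobs and then exit."""
    jobs.put(_STOP)
    thread.join()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        jobs, thread = start_writer()
        # The request handler would only enqueue and return.
        jobs.put((Path(tmp) / "inbox" / "msg-1.bin", b"example payload"))
        jobs.join()  # wait until everything queued so far is written
        stop_writer(jobs, thread)
```

The request handler still has to read every chunk of the upload itself; only the final write moves off the request path, which is why this helps only when the storage is clearly slower than the incoming data.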
But unless this really is an extreme situation, I’m not sure you’re not trying to solve a problem that doesn’t really exist.
Thanks for your response. Seriously… it is guesswork, and you have a very valid point. The Celery task is an option I was thinking about, but so far I have not had a need for an async worker.
Nevertheless, I would like to elaborate, as the question is indeed a bit too generic. I am looking at a particular Django package: django-pyas2. It is a file transfer server for AS2 messages. The HTTP POST requests carry MIME content, which is unpacked and decrypted, and the signatures are verified. Then the payload is saved (this is where the file IO happens) and a signed delivery notification is sent as the response.
At peak I see about 300 requests per minute, each with a file attached. The file sizes vary, but the majority are small (<10k). Response times spike during those peaks, and one of my assumptions is that blocking IO threads are a cause. I have taken some measurements on Azure so far, and I was thinking I might try something and see whether it helps. Right now it is all sync, and it does not run on bare metal but in a container in a K8S cluster; the file path is on attached storage, which adds an overlay filesystem as well. So yes, throwing hardware at it would probably solve the problem, but maybe there is a better way.
I have not looked at custom upload handlers…
How are you running the django process? Gunicorn? uwsgi? (something else? Please don’t say runserver.) One solution may simply be to increase the number of worker processes.
It may also be worth trying to look at more granular statistics regarding your response times to determine whether these increases are affecting non-upload requests. (It’s possible that an upload that takes 20 seconds is pulling the overall average up and that the uploads themselves are not affecting non-upload requests.)
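A small sketch of what "more granular" could look like: grouping response times by path and comparing mean, median, and 95th percentile. The sample data and the paths `/upload` and `/status` are made up for illustration; real numbers would come from access logs or an APM tool.

```python
from collections import defaultdict
from statistics import mean, median, quantiles


def summarize(samples):
    """Group (path, seconds) samples by path and report mean, median and p95."""
    by_path = defaultdict(list)
    for path, seconds in samples:
        by_path[path].append(seconds)

    report = {}
    for path, values in by_path.items():
        # quantiles() needs at least two data points.
        if len(values) > 1:
            p95 = quantiles(values, n=20, method="inclusive")[-1]
        else:
            p95 = values[0]
        report[path] = {
            "count": len(values),
            "mean": mean(values),
            "median": median(values),
            "p95": p95,
        }
    return report


if __name__ == "__main__":
    # Hypothetical data: 19 fast uploads, one 20-second upload,
    # and ten unrelated requests that are not uploads.
    samples = (
        [("/upload", 0.2)] * 19
        + [("/upload", 20.0)]
        + [("/status", 0.05)] * 10
    )
    for path, stats in summarize(samples).items():
        print(path, stats)
```

In this made-up data the single slow upload drags the `/upload` mean far above its median, while `/status` is unaffected. That is the pattern to look for before concluding that uploads slow down everything else.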
This would be pushing the envelope… totally possible, but not well-trodden ground. (If you do do it, a blog post about what you discover would be cool.)
TBH you should be able to scale well beyond the 99% case without needing to reinvent the wheel here. (Ken’s answers are IMO)
@KenWhitesell - very good point about the slow upload!
I run the processes with gunicorn; the gevent worker class has so far yielded the best results in my environment.
command=gunicorn --keep-alive=10 --graceful-timeout=100 --timeout=60 --log-file=- --workers=3 --worker-connections=500 --max-requests=10000 --max-requests-jitter=500 --worker-class=gevent --worker-tmp-dir /dev/shm --forwarded-allow-ips="*" --capture-output --log-level INFO --bind :80 as2.wsgi:application
The DB is MySQL, but that is rather under-utilized. Using the axiros/gevent_mysqldb driver (a MySQL database connector for Python with Python 3 support) has actually dropped the response times during the peaks by 60%. They were regularly above 60 seconds, which probably led me to the assumption that the delay is on my side.
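For readability, the same flags could live in a gunicorn config file, which also makes it easier to experiment with the worker count suggested earlier in the thread. A sketch, assuming a file named `gunicorn.conf.py` (the file name is an assumption; only the command line appears in the thread) and using the config-file names of the flags shown above:

```python
# gunicorn.conf.py (hypothetical file name), mirroring the command line above.

bind = ":80"

# Worker model: 3 gevent workers, up to 500 concurrent connections each.
worker_class = "gevent"
workers = 3
worker_connections = 500
worker_tmp_dir = "/dev/shm"

# Timeouts in seconds.
keepalive = 10
timeout = 60
graceful_timeout = 100

# Recycle workers after roughly 10000 requests, with jitter so that
# they do not all restart at the same moment.
max_requests = 10000
max_requests_jitter = 500

# Logging: "-" sends the error log to stderr.
errorlog = "-"
loglevel = "info"
capture_output = True

forwarded_allow_ips = "*"
```

It would be started with `gunicorn -c gunicorn.conf.py as2.wsgi:application`; changing `workers` is then a one-line edit between load tests.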
@carltongibson: thanks for reconfirming Ken’s view! But the way it looks, the blog post is still miles away.
I will have to do my homework: proper load testing and measurement of the requests and the file IO to remove the uncertainties. Meanwhile, I will scale resources, I guess.
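One way to start on the file IO part of that measurement is to time small writes directly on the storage path, outside of Django. A minimal sketch, assuming files of about 10 kB as described earlier; the function name `time_small_writes` and the probe file names are hypothetical, and the directory would be pointed at the attached storage in a real test.

```python
import os
import tempfile
import time
from pathlib import Path
from statistics import median


def time_small_writes(directory, count=100, size=10_000, fsync=False):
    """Write `count` files of `size` bytes; return per-file durations in seconds."""
    payload = os.urandom(size)
    durations = []
    for i in range(count):
        target = Path(directory) / f"probe-{i}.bin"
        start = time.perf_counter()
        with open(target, "wb") as fh:
            fh.write(payload)
            if fsync:
                # Force the data to the storage device, not only the page cache.
                fh.flush()
                os.fsync(fh.fileno())
        durations.append(time.perf_counter() - start)
        target.unlink()  # clean up the probe file
    return durations


if __name__ == "__main__":
    # A temporary directory is used here; replace it with the real upload path.
    with tempfile.TemporaryDirectory() as tmp:
        timings = time_small_writes(tmp, count=50, fsync=True)
        print(f"median: {median(timings) * 1000:.2f} ms")
        print(f"max:    {max(timings) * 1000:.2f} ms")
```

If the median write takes a few milliseconds even with `fsync=True`, the file write is unlikely to explain response times measured in seconds, and the decryption, signature checks, or worker count become the more likely suspects.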
Thank you both for your views! Highly appreciated.