I have a Django API running inside a Docker container behind nginx, with gunicorn serving the app. To start the Django app, the following command is used inside the docker-compose.yml file:
```
gunicorn app.wsgi:application --bind 0.0.0.0:8000 --workers=1
```
On startup, I want to start a separate process that will act as a RabbitMQ QueueListener.
To do this, I added a startup hook inside app/urls.py.
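In rough outline it looks something like this (a simplified sketch; `QueueListener` stands in for my actual listener class):

```python
# app/urls.py -- simplified sketch; QueueListener is a stand-in name
import threading

from app.listener import QueueListener  # placeholder import

urlpatterns = [
    # ... the usual URL patterns ...
]

# Start a single background thread for the listener when the URLconf
# is first imported, i.e. once per worker process at startup.
listener_thread = threading.Thread(target=QueueListener().run, daemon=True)
listener_thread.start()
```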
From what I understand, this guarantees that only one instance of the listener is started at startup, and that I have access to the DB and other variables inside the Django app.
Inside this QueueListener, the messages are logged to a file.
What I want to achieve is log rotation based on file size, with the rotated files archived. The current implementation of the log rotation is shown below.
```python
import datetime
import gzip
import logging
import os
from logging.handlers import RotatingFileHandler
from django.conf import settings

def rotator(source, dest):
    # Compress the rotated log file, then remove the uncompressed original.
    with open(source, 'rt') as f_in:
        with gzip.open(dest, 'wt') as f_out:
            f_out.writelines(f_in.readlines())
    os.remove(source)

def namer(name):
    # listener.log.1 -> listener.log.<YYYY-MM-DD>.1.gz
    parts = name.split('.')
    return '.'.join(parts[0:2]) + "." + datetime.datetime.now().strftime("%Y-%m-%d") + "." + parts[2] + ".gz"

logger = logging.getLogger("QueueCommunicator")
logger.setLevel(logging.INFO)
handler = RotatingFileHandler(os.path.join(settings.STORAGE_DIR, 'listener.log'),
                              backupCount=5, maxBytes=120000000)
handler.setFormatter(logging.Formatter(
    fmt='%(asctime)s,%(msecs)d %(process)d %(threadName)s %(name)s %(levelname)s %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S'))
handler.rotator = rotator
handler.namer = namer
logger.addHandler(handler)
```
The problem I have is that the archived (.gz) file contains only the first line of the log instead of the entire content of the file before the rotation. When I run the project locally in a Docker container that does not use nginx, the rotation works and the whole content ends up in the .gz file, so I am not sure what the cause could be, or whether there is some error that I am missing.
Is it possible that an error is thrown while the process is writing to the file at the same time the rotation is happening?
Do you have any idea how I could correctly achieve size-based log rotation together with archiving?
Thank you!
You don’t want to do this from within your Django process. Your code does not control the process in which it’s being run. This is why things like Celery are started externally to the main process.
This is no guarantee.
See the response and the linked messages at I'm new to WebDev, explain to me why or why it wouldn't be OK for all users to access a global Lark() parser object? - #8 by KenWhitesell to get an idea of some of the reasons why this is bad. (What’s going to happen when your process is restarted? What’s going to happen when multiple instances of the process are started?)
This is not something you need to write.
The Python RotatingFileHandler will do this for you. If you need to perform some specific processing when the files are rotated, you can implement your own rotate handler.
How should I approach this from an external process if I want to be able to save/update some entities that are defined inside the Django app?
And if I need to archive some of the rotated files to save space, would I have to implement my own rotate handler rather than use the existing RotatingFileHandler as-is?
Thank you for the response!
Look at both Custom django-admin commands and Celery workers for examples. These are both mechanisms used to access a database using the Django ORM. Consider the Django shell command: it opens a stand-alone process giving you full access to the ORM, having nothing to do with a running instance of a server.
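As a rough sketch (the file and class names here are placeholders, not code from your project), such a management command could look like:

```python
# app/management/commands/run_listener.py  (hypothetical file name)
from django.core.management.base import BaseCommand

from app.listener import QueueListener  # placeholder for your listener class


class Command(BaseCommand):
    help = "Run the RabbitMQ queue listener as a stand-alone process."

    def handle(self, *args, **options):
        # Django is fully initialized before handle() runs, so the ORM
        # is available here without any web server being involved.
        QueueListener().run()
```

You would then run it with python manage.py run_listener, entirely independent of gunicorn.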
You would need to define a new class that inherits from RotatingFileHandler and overrides the rotate method. It’s still a lot less work than trying to completely handle all aspects of file rotation yourself.
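As a rough sketch, reusing the gzip logic from your rotator function, such a subclass might look like:

```python
import gzip
import os
from logging.handlers import RotatingFileHandler


class ArchivingRotatingFileHandler(RotatingFileHandler):
    """A RotatingFileHandler that gzips each file as it is rotated."""

    def rotate(self, source, dest):
        # Called during doRollover(); instead of a plain rename, compress
        # the old log into <dest>.gz and remove the uncompressed file.
        if os.path.exists(source):
            with open(source, 'rb') as f_in, gzip.open(dest + '.gz', 'wb') as f_out:
                f_out.write(f_in.read())
            os.remove(source)
```

One caveat: since the archived files gain a .gz suffix, you would likely also need to override rotation_filename (or set a namer) so the backupCount bookkeeping can find the archives it needs to shift and delete.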
Thank you for your time and response!
I will look into the mentioned topics. Celery seemed like overkill initially, and I had hoped I would not need to integrate it into the project.
To be clear - I’m not saying you need to implement Celery. I’m highlighting it as an example of a way of setting up a task that exists as a separate process, independent of your server processes.
If the process you’re trying to run is something that should always be running, I probably wouldn’t set it up as a Celery task. I’d probably create it as a custom admin command and run it using manage.py. (More likely, I’d set it up as either a system service or using supervisord if I wanted to ensure it was always available.)
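For illustration, a supervisord program entry for such a command might look something like this (all paths and names here are assumptions):

```ini
; /etc/supervisor/conf.d/queue_listener.conf  (hypothetical paths and names)
[program:queue_listener]
command=python manage.py run_listener
directory=/app
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/queue_listener.log
```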