Detailed LogEntry for m2m field

I am trying to customize the history view in the Django Admin site to add some extra information about changes to models.
Specifically, I would like to be able to add a custom string representation of the specific changes to one or more many-to-many fields in the model, rather than just the fact that a change occurred (ex: “X, Y, Z added, A, B, C removed from m2m field Foo”).
I believe this should not be overly complicated, but I am not sure where to “apply the scalpel” in this case.
I have looked at both “django-simple-history” and “django-reversion”, but neither one seems to help me do what I want (I don’t actually need to be able to revert to a previous version anyway, just add a custom change_message).
Specifically, django-simple-history does not track m2m fields, though it appears to have a possible workaround by using custom through models.


django-reversion has better built-in tracking of m2m fields, but the history log entries also only show that “something changed”, and I have to inspect the previous versions one by one to see what actually changed. Having to wipe out the history after any migration of the model’s fields is also not ideal.
I believe my solution lies somewhere in a combination of a receiver of the “m2m_changed” signal (particularly the “pk_set” parameter), along with a custom LogEntry, possibly overriding the “get_change_message” method, or maybe a call to “LogEntry.objects.log_action” with a custom “change_message”? I noticed that “pk_set” is sent separately for the “pre/post_add” and “pre/post_remove” actions of the m2m_changed signal, so maybe I would have no choice but to create two separate log entries?
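
One way the separate add and remove deliveries could be folded into a single message is to accumulate them first and log once at the end. The following is only a rough sketch in plain Python: the M2MChangeCollector class and its method names are hypothetical, and the wiring to an actual m2m_changed receiver (which would translate pk_set into readable labels and call record()) is assumed and not shown.

```python
from collections import defaultdict


class M2MChangeCollector:
    """Accumulates add/remove notifications so they can be logged as one message."""

    def __init__(self):
        # field name -> {"added": set of labels, "removed": set of labels}
        self._changes = defaultdict(lambda: {"added": set(), "removed": set()})

    def record(self, action, field, labels):
        # Only the "post_" actions are recorded; the matching "pre_" actions
        # are ignored so that a change is not counted twice.
        if action == "post_add":
            self._changes[field]["added"].update(labels)
        elif action == "post_remove":
            self._changes[field]["removed"].update(labels)

    def message(self):
        lines = []
        for field in sorted(self._changes):
            change = self._changes[field]
            added = ", ".join(sorted(change["added"])) or "Nothing"
            removed = ", ".join(sorted(change["removed"])) or "Nothing"
            lines.append(f"{field}: {added} added; {removed} removed.")
        return "\n".join(lines)


collector = M2MChangeCollector()
collector.record("post_remove", "locations", ["A", "B"])
collector.record("post_add", "locations", ["X", "Y"])
collector.record("pre_add", "locations", ["ignored"])
print(collector.message())
```

With a collector like this, the resulting string could be passed as a single change_message instead of writing one log entry per action.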

Any help is appreciated!

I was going to use django-simple-history, but since I have the same needs and have now seen the problems you describe, maybe I will have to try something else.

I’m going to tackle this later, but I’m interested in your solution if you find one.

After a deep dive into the Django source code, a lot of brainstorming (including some help from a consultant on my project), and testing out a couple of ideas, I have arrived at this approach, which I think should solve all of my problems.

One additional requirement that I did not mention above is that I would also like to be able to store a file (in this case, an authorization email) with each model change. This requirement is what led me to choose to include django-reversion in my solution, because it is able to store previous versions of FileFields after each update.

So here is my solution:
NOTE: I am using my own generic User Model. I may eventually decide to subclass auth.User so that I can tie into the various authentication backend options.
Also, I am using django-mptt for a hierarchical location model, but that is not required (just get rid of all of the Tree stuff).

models.py:

from django.db import models
from mptt.models import MPTTModel, TreeForeignKey, TreeManyToManyField, TreeManager
import reversion

SHORT, LONG, MAX, PLACES, DIGITS = (... some default character lengths for various fields ...)

class UserManager(models.Manager):
    def get_by_natural_key(self, key):
        return self.get(username=key)

@reversion.register(
    follow=['locations', ]) # ... any other m2m foreign keys you want to track ...
    # use_natural_foreign_keys=True, use_natural_primary_keys=True)
class User(models.Model):

    username = models.CharField(max_length=LONG*2, unique=True)
    locations = TreeManyToManyField('Location', related_name='detail_users')
    # ... other stuff ...

    # NOTE: I initially tried adding a custom function for upload_to, but I think django-reversion had a hard time with it
    validation_email = models.FileField(upload_to='validation_emails', null=True)

    objects = UserManager()
    def natural_key(self):
        return (self.username,)  # natural_key() should return a tuple


class LocationManager(TreeManager):
    def get_by_natural_key(self, key):
        return self.get(location_code=key)

@reversion.register()
# use_natural_foreign_keys=True, use_natural_primary_keys=True)
class Location(MPTTModel):

    location_code = models.CharField(max_length=LONG, unique=True)
    parent = TreeForeignKey('self', on_delete=models.DO_NOTHING, null=True, related_name='children')
    description = models.CharField(max_length=LONG*2)

    # ... other stuff ...
    objects = LocationManager()
    def natural_key(self):
        return (self.location_code,)  # natural_key() should return a tuple

In admin.py, I need to create a custom ModelAdmin, and register it with django-reversion.
However, I need to make a few specific tweaks to get the behavior that I want.

In particular, this snippet of code from “_changeform_view()” inside django.contrib.admin.options.ModelAdmin was very informative:

if request.method == 'POST':
    form = ModelForm(request.POST, request.FILES, instance=obj)
    form_validated = form.is_valid()
    if form_validated:
        new_object = self.save_form(request, form, change=not add)
    else:
        new_object = form.instance
    formsets, inline_instances = self._create_formsets(request, new_object, change=not add)
    if all_valid(formsets) and form_validated:
        self.save_model(request, new_object, form, not add)
        self.save_related(request, form, formsets, not add)
        change_message = self.construct_change_message(request, form, formsets, add)
        if add:
            self.log_addition(request, new_object, change_message)
            return self.response_add(request, new_object)
        else:
            self.log_change(request, new_object, change_message)
            return self.response_change(request, new_object)

I thought that overriding construct_change_message() would be the best place to dig in, because the resulting change_message is automatically included in the default LogEntry class, as well as in django-reversion’s customized history page.

…However, in the above snippet I can see that save_model() is called before construct_change_message().

Unfortunately, that means that I cannot compare the data in the form to the data in the database directly within the construct_change_message function. That is because the database has already been updated in the save_model step, so both share identical data.
NOTE: Django internally keeps track of which fields are changed (“form.changed_data”), but it does not store exactly HOW those fields changed (this is the part that I need to hack!)
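
To make the gap concrete: a list of changed field names says nothing about the old values, so those have to be captured before the save overwrites them. Here is a minimal sketch in plain Python, using SimpleNamespace objects as stand-ins for the old and new model instances; diff_fields() and the "email" field are hypothetical names used only for illustration.

```python
from types import SimpleNamespace


def diff_fields(old, new, fields):
    """Return {field: (old_value, new_value)} for every listed field that differs."""
    changes = {}
    for field in fields:
        old_value = getattr(old, field)
        new_value = getattr(new, field)
        if old_value != new_value:
            changes[field] = (old_value, new_value)
    return changes


# Stand-ins for "the row as it is in the database" and "the instance built from the form".
old = SimpleNamespace(username="alice", email="a@example.com")
new = SimpleNamespace(username="alicia", email="a@example.com")

# The comparison has to happen BEFORE saving; afterwards both sides would be identical.
print(diff_fields(old, new, ["username", "email"]))
```

Running the same comparison after the save step would return an empty dict, which is exactly the problem described above.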

To get around this problem, I decided to modify the ModelAdmin’s __init__() constructor to add an extra variable to cache those changes, and then store them when save_model() is called. Then I can use those cached changes to construct the change_message in the construct_change_message() function.
admin.py:

from django.contrib import admin
from django import forms
from <MyApp> import models
from mptt.admin import MPTTModelAdmin
from mptt.forms import TreeNodeMultipleChoiceField
from django.core.exceptions import ValidationError
from reversion.admin import VersionAdmin


@admin.register(models.User)
class UserAdmin(VersionAdmin, admin.ModelAdmin):

    list_display = ('username',)
    search_fields = ('username',)
    ordering = ('username',)
    
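    # use my own custom form (defined elsewhere, not shown here);
    # it needs a 'change_reason' field, which cache_changes() reads below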
    form = UserPermissionsForm

    # override init to store an extra dictionary in the UserAdmin object.
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        self.changes = {}

    # override save_model to store the changes
    def save_model(self, request, obj, form, change):

        # NOTE: I only need to store changes when the model is changed,
        #             not when a new instance is created.
        if change:
            self.cache_changes(request, obj, form)
        
        super().save_model(request, obj, form, change)

    # My own custom function to keep track of the changes to the User model
    def cache_changes(self, request, obj, form):

        # Get the new (unsaved) model instance from the form (you can use "obj" here instead if you like)
        new_model = form.instance

        # Fetch the old version of the model from the database
        # NOTE: Some solutions I have seen store a cache of previous versions of a model
        # in the model's __init__() constructor, but I don't mind a few extra database hits here
        # because these types of queries will only be run by staff, and executed ~1/week
        # ... premature optimization and all of that ...
        old_model = type(new_model).objects.get(pk=new_model.pk)

        changes = {}
        text_fields = ['username', ] # include any other direct fields on the User model (not foreign keys)
        for field in text_fields:
            change_str = ''
            if field in form.changed_data:
                old_value = getattr(old_model, field)
                new_value = getattr(new_model, field)
                change_str = f'Changed from {old_value} to {new_value}'
            changes[field] = change_str
            
        m2m_fields = ['locations', ] # include any other m2m foreign keys
        for field in m2m_fields:
            change_str = ''
            if field in form.changed_data:

                # In this case, I am happy to just use the string representation for each field
                # You could also use the natural_key, or write your own alternate repr() function in your model
                new_values = [str(v) for v in form.cleaned_data[field]]
                old_values = [str(v) for v in getattr(old_model, field).all()]

                # Whatever you chose above, you should make sure 
                # that it is unique for each instance of your model though!
                # added = in the new selection but not the old one; removed = the reverse
                added_values   = sorted(set(new_values) - set(old_values))
                removed_values = sorted(set(old_values) - set(new_values))

                change_str += (', '.join(added_values)   if added_values   else 'Nothing') + ' added.\n'
                change_str += (', '.join(removed_values) if removed_values else 'Nothing') + ' removed.\n'

            changes[field] = change_str
        
        changes['Comment'] = form.cleaned_data['change_reason']        
        
        # cache the changes so they can be used to generate an appropriate change_message
        self.changes = changes

        # NOTE: In a previous iteration, I just created my own UserChange model to store these messages
        # However, that presented its own challenges
        # when attempting to integrate it into the history view for a given User model...
        # models.UserChange(**changes).save()

    # Now I can override the ModelAdmin's construct_change_message()
    # to inject my own change_message (only for changed models, not when adding new ones)
    # Also, I have skipped formsets, because I don't fully understand how they work yet
    # and I am not using any here...
    def construct_change_message(self, request, form, formsets, add=False):
        if not add and not formsets:
            return '\n'.join([f'{key}: {value}' for key, value in self.changes.items() if value])
        else:
            return super().construct_change_message(request, form, formsets, add)
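
The set arithmetic for the m2m fields is easy to get backwards, so it can help to check it in isolation. Below is a small standalone sketch, assuming the values have already been converted to unique strings; describe_m2m_change() is a hypothetical helper name, not part of Django or django-reversion.

```python
def describe_m2m_change(old_values, new_values):
    """Summarize which labels were added to or removed from an m2m selection."""
    old_set, new_set = set(old_values), set(new_values)

    # Present only in the new selection -> added.
    added = sorted(new_set - old_set)
    # Present only in the old selection -> removed.
    removed = sorted(old_set - new_set)

    summary = (", ".join(added) if added else "Nothing") + " added.\n"
    summary += (", ".join(removed) if removed else "Nothing") + " removed.\n"
    return summary


print(describe_m2m_change(["A", "B", "C"], ["B", "C", "X"]))
```

Keeping this logic in a plain function also makes it straightforward to unit test without touching the admin.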

… So that’s it! I hope that can be helpful to some other people too!

The main advantage of intercepting the change_message is that I can build my solution directly into the standard LogEntry machinery so I don’t have to create my own History view.
The disadvantage is that I only get one string to describe the whole change, instead of breaking those changes up into multiple columns for each field. However, it seems that Django has no trouble rendering newline characters, so I can format the message nicely enough for my purpose anyway.
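
As a rough illustration of that single-string format, this is how the cached dictionary collapses into one multi-line message. The sample keys and values are made up for the example, and build_change_message() is a hypothetical name for the same expression used in construct_change_message() above.

```python
def build_change_message(changes):
    """Join non-empty entries of the changes dict into one newline-separated string."""
    return "\n".join(f"{key}: {value}" for key, value in changes.items() if value)


sample_changes = {
    "username": "Changed from alice to alicia",
    "locations": "",  # unchanged fields are stored as empty strings and skipped
    "Comment": "Requested by email",
}

print(build_change_message(sample_changes))
```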