Model.clean after errors on Model.full_clean

Hey there!
I have a situation that i want to share here, and maybe get some additional toughts on it.
The models involved:

from django.db import models

class PointRule(models.Model):
    name = models.CharField(
        verbose_name=_("Rule name"),
        help_text=_("This name is going to be used only for internal reference"),
        max_length=128,
    )
    starts_at = models.DateField(
        verbose_name=_("Starts at"),
        help_text=_("The first day (inclusive) that this rule will be on effect"),
    )
    ends_at = models.DateField(
        verbose_name=_("Ends at"),
        help_text=_("The last day (inclusive) that this rule will be on effect"),
    )
    # Other fields

    def clean(self):
         if self.starts_at > self.ends_at:
            raise ValidationError(
                {"ends_at": _("The ends at date must be greater or equal than the starts at")}
            )

Pretty straight forward validation here, i want to assert that the start date is greater thant the end date. Assuming that this fields are required (non-blank, non-nullable), from my point of view my clean method shouldnt have to worry about starts_at or ends_at being None or any other non-date value.
That’s not the case though, if i go the shell and pass this scenario a TypeError is raised because one or both values are None:

from some_app.models import PointRule

r = PointRule(name="foo", starts_at=None, ends_at=None)
r.full_clean()
TypeError: '>' not supported between instances of 'NoneType' and 'NoneType'

I wonder if there’s any benefits of adding a mechanism to prevent that the clean method is called when we know we have errors on fields already.
That would be a good case on this example, on the full_clean method, django called clean_fields and we had some errors populated from both of the fields (starts_at and ends_at). So my clean method wouldn’t need to be worried about it.
Of course, right now this can be done on the clean method to prevent the TypeError for being raised, but it looks i bit wacky:

# PointRule model
def clean(self):
    if self.starts_at is None or self.ends_at is None:
        return
    # Validation continues, asserted that the values aren't None

This works great for a few fields, but when the clean method contains more fields to validate it makes not really a great experience.

Of course one can go and create a “BaseModel”, override the full_clean method to prevent that the clean method is not called when there’s any errors on the validation, looking like:

# I removed some comments to better describe where the changes can be done
from django.db import models
class BaseModel(models.Model):
    def full_clean(self, exclude=None, validate_unique=True, validate_constraints=True):
        errors = {}
        if exclude is None:
            exclude = set()
        else:
            exclude = set(exclude)

        try:
            self.clean_fields(exclude=exclude)
        except ValidationError as e:
            errors = e.update_error_dict(errors)

        try:
            ################################
            if not errors:
                self.clean()
            ###############################
        except ValidationError as e:

Right, now if we inherit from the BaseModel, we can assert that the clean method would only be called when there’s no errors, and in fact a ValidationError is raised, and we can remove that if statement we added:


class PointRule(BaseModel): # <<<< Inherit the base model
    name = models.CharField(
        verbose_name=_("Rule name"),
        help_text=_("This name is going to be used only for internal reference"),
        max_length=128,
    )
    starts_at = models.DateField(
        verbose_name=_("Starts at"),
        help_text=_("The first day (inclusive) that this rule will be on effect"),
    )
    ends_at = models.DateField(
        verbose_name=_("Ends at"),
        help_text=_("The last day (inclusive) that this rule will be on effect"),
    )
    # Other fields

    def clean(self):
         # Our clean method doesnt need to worry about bad data
         if self.starts_at > self.ends_at:
            raise ValidationError(
                {"ends_at": _("The ends at date must be greater or equal than the starts at")}
            )

Then on the shell again:

from some_app.models import PointRule

r = PointRule(name="foo", starts_at=None, ends_at=None)
r.full_clean()
django.core.exceptions.ValidationError: {'starts_at': ['This field cannot be null.'], 'ends_at': ['This field cannot be null.']}

Looks great! Sadly it doesn’t work with ModelForms (and the Admin), because on the ModelForm validation the fields would be validated on the form, and if there’s any errors already on that fields, the full_clean method would receive a exclude parameter, and consequently the errors variable inside the full_clean method would have no errors and we go back to the same. (Of course we could check the exclude parameter on full_clean and not call clean if there’s any fields excluded, but then we come to a point of ambiguity: Is there any errors or should a field just be ignored?)
I wonder if there’s any point on adding a new parameter to the full_clean method: “errors” for instance. This errors could receive the errors from the ModelForm that were validating.
Are your toughts on this?
Cheers

1 Like

There’s another option you might consider - you could override the full_clean method on individual models to exclude fields that are being checked in clean so that they aren’t checked in clean_fields.

Hmm, but that’s not going to solve that, i think. In that way my clean method would still be called with bad data/types. Actually what i would want to achieve is the inverse of that, i don’t want that the clean method to be called if i have any errors, either on clean_fields from the full_clean or from the ModelForm (that may had validation errors on the fields and excluded some fields from being validated on full_clean.
I think i could achieve this if i override a few components (Model, ModelForm, and ModelAdmin) myself in order to assert that aren’t any errors before calling clean, but it feels like a lot honestly.
Maybe someone has a similar use case to this?

If you do not call clean when there is at least one error, you may miss some errors in the final result. For example, if the errors (from clean_fields) are only for the name field (e.g. too long value), then by not calling clean, your model won’t return a potential error for ends_at being before starts_at.

However, your use case is legitimate and it may appear cumbersome to check for each field implied in a cross-checking whether the check can be achieved or not (in your case, that would mean verify if starts_at and ends_at are instances of datetime.date before doing comparison).

Another approach if you want to do cross-check only if implied fields are not in errors could be to override clean_fields like this

def clean_fields(self, excluded=None):
    errors = {}
    try:
        super().clean_fields(excluded=excluded)
    except ValidationError as e:
        e.update_error_dict(errors)
    try:
        if "starts_at" not in errors and "ends_at" not in errors and self.starts_at > self.ends_at:
            raise ValidationError({"ends_at": _("...")})
    except ValidationError as e:
        e.update_error_dict(errors)
    if errors:
        raise ValidationError(errors)

But checking whether fields to check are in errors like in the above is not really simpler than writing:

def clean(self):
    if isinstance(self.starts_at, datetime.date) and isinstance(self.ends_at, datetime.date) and self.starts_at > self.ends_at:
        raise ValidationError({"ends_at": _("...")})

Moreover, the latter has the advantage of checking consistency between starts_at and ends_at if both are dates but one is invalid (for example because it is beyond a max value you may have define on those fields with a validator). To achieve the same with the clean_fields override example above, you not only have to check whether fields are in errors or not, but you also need to check what kind of error happened.

Hope that makes sense.

Thanks for replying,

I mean, this i like doing the same validation as the field already (not counting that the field may coerce the value to the expected type), and if the users sends a None value my field would already took care of that validation, as i already set him to do it, so it seems like it’s a duplication and not trusting on the field validation itself.
The key point of this topic is that the clean method can receive bad data, and i think that this can lead to tricky errors. On this specific model for example, if the user opens the Admin, or any other FormView with a ModelForm, do not fill any fields and hit Submit if a have this clean check

    def clean(self):
         if self.starts_at > self.ends_at:
            raise ValidationError(
                {"ends_at": _("The ends at date must be greater or equal than the starts at")}
            )

We get a TypeError and a user receives a 500 Internal Server Error because this validation was comparing two None values, and most importantly, the ModelForm, already had collected errors from the fields itself, so we already knew that both the starts_at and ends_at were invalid, there’s no need to check it again.

What it would make sense to me is that the clean method would not be called on that situation, but looking at the source code this happens to keep consistency between model.clean and form.clean.

I would love to see an additional hook to validate data, but only data that we shouldn’t care about types, and values that are not compatible with fields definitions.

1 Like

I’ve just come across this, as well. I only want to run my model’s clean() if clean_fields() has not resulted in any errors.

The key point of this topic is that the clean method can receive bad data, and i think that this can lead to tricky errors.

Yes, this is the crux of the issue. Here’s my situation, which is a bit more complex than just whether a requiredfield is populated or not:

I am using wagtail-color-panel to add two fields to my model that takes a color value in hex format. It does its own validation on the field that the format is correct.

In my model’s custom clean(), I want to calculate the contrast ratio between the two color fields to ensure that it meets WCAG contrast guidelines. If the value of one of the color fields isn’t in the correct format, my clean() method shouldn’t ALSO have to validate that, duplicating the field validation already provided by the wagtail-color-panel field, but as of now, I do have to.

Well, you can see some dirty code that i wrote a few months ago here.
Was tested on django4.2

Just be aware that this can lead to some “strange” errors, ones that you should be aware of:
When raising an error inside the clean_valid_data don’t use a field that is not defined on the form, otherwise you will get a TypeError (not sure the exact error type right now). It would be better to raise just “nonfield-errors” there. If you’re putting this inside your project, you better be good at debugging some internal django-code.
Hope this helps you achieve your goal.