Why can't I call full_clean on data bound for a non-default database?

Background: I modified our codebase a while back to make use of multiple databases for validation of user-submitted data. This is for automated checks (uniqueness and the like), so that users can fix simple issues with their files themselves.

Our original code (not written by me) had some fundamental issues with loading. It had side-effects. It should have used transaction.atomic, but didn’t, and simply adding it broke the codebase. I have actually moved on to a proper refactor of the code to eliminate the need for a second database, but there was one issue with the 2-database design that I never figured out and I hate not understanding things, so I thought I would ask…

Question: I created a second database (with the alias “validation”) and inserted .using(db) and .save(using=db) in all the necessary places in our code, so that loading data can be tested without risking the production data.

Everything works as expected with the 2 databases except calls to full_clean(). Take this example:

new_compound_dict = {
    "name": "test",
    "formula": "C1H4",
    "hmdb_id": "HMBD0000001",
}
new_compound = Compound(**new_compound_dict)
new_compound.full_clean()
new_compound.save(using="validation")

It gives me this error:

django.core.exceptions.ValidationError: {'name': ['Compound with this Name already exists.'], 'hmdb_id': ['Compound with this HMDB ID already exists.']}

I get the same error with this code:

new_compound, inserted = Compound.objects.using("validation").get_or_create(**new_compound_dict)
if inserted:
    new_compound.full_clean()

Both examples above work without a problem on the default database.

I looked up full_clean in the docs for Django 3.2, but I don’t see a way to have it run against a database other than default, which I’m assuming is what I would need to do to fix this. I can’t find any mention of potential issues related to a non-default database, either. I had expected the docs to show a using parameter to full_clean (like the one for .save(using=db)), but there’s no such parameter.

I debugged the above examples with this before and after each example block:

Compound.objects.using("default").filter(**new_compound_dict).count()
Compound.objects.using("validation").filter(**new_compound_dict).count()

For the default database, the counts are 0 before and 1 after with no error. For the validation database, the counts are 0 and 1, but with the error mentioned above.

At this point, I’m confounded. How can I run full_clean on a database other than the default? Does full_clean just fundamentally not support non-default databases?

Footnote: The compound loading code loads data into both databases. That data is never submitted by users, but it must be in both databases in order to validate what users submit, so the compound load script is one of 2 scripts that loads data into both databases (so that it’s in the validation DB when the user submits their data). The default load always happens before the validation load, so when I load the validation database, the test compound is already present in the default database and only appears in the validation database after the validation load.

This is an interesting topic. I don’t have any direct knowledge or experience in this area, but I was curious enough about it to see how far I might be able to get to figure something out.

In Django 3.2.9 (the version of 3.2 that I currently have installed), it looks like the unique test for a model is performed using the Model._default_manager.
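If that reading is right, one possible workaround is to skip the built-in uniqueness pass and repeat the lookup yourself against the alias you care about. The following is only a rough sketch: unique_conflicts is a made-up helper name, the list of field names is something you would supply yourself, and it only handles single-field uniqueness (no unique_together or constraints). I haven’t run it against a real project.

```python
def unique_conflicts(instance, alias, field_names):
    """Return the names of fields whose current value already exists in `alias`.

    Hypothetical helper, not part of Django. It relies only on the model's
    default manager supporting .using(alias), .filter(), .exclude() and .exists().
    """
    manager = type(instance)._default_manager
    conflicts = []
    for name in field_names:
        # Query the chosen database explicitly instead of letting routing decide.
        qs = manager.using(alias).filter(**{name: getattr(instance, name)})
        if instance.pk is not None:
            # Do not report the row that represents this instance itself.
            qs = qs.exclude(pk=instance.pk)
        if qs.exists():
            conflicts.append(name)
    return conflicts
```

With the example from the question, that might be used as new_compound.full_clean(validate_unique=False) followed by unique_conflicts(new_compound, "validation", ["name", "hmdb_id"]), raising your own error if the returned list is not empty.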

I would guess that you would probably be able to create a custom manager that checks for a specific attribute on a model instance if that instance is to be used with a non-default database.

This ought to cover those cases where a query or function involves a manager. If you’re calling save directly on the model instance, an overridden save method on the model would be able to do the same thing.

This is just the first idea that came to mind; there may be an easier way to do it.

You might also be able to use a custom router to check an instance, if the function being used provides the instance “hint” to the router.
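To make the router idea a bit more concrete, here is a minimal sketch. Routers are plain classes, so nothing here imports Django. Since I’m not sure the uniqueness query passes an instance hint at all, the sketch also reads a context variable that you set around the call, which is a different mechanism from the hint. The names ValidationRouter, route_to and _active_db are all made up for illustration, and I haven’t tested this against a real multi-database project.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical module-level switch: the alias to route to, or None for "no opinion".
_active_db = contextvars.ContextVar("active_db", default=None)


@contextmanager
def route_to(alias):
    """Route unpinned queries to `alias` for the duration of the with-block."""
    token = _active_db.set(alias)
    try:
        yield
    finally:
        _active_db.reset(token)


class ValidationRouter:
    """Sketch of a router that prefers the active alias, then the instance hint."""

    def db_for_read(self, model, **hints):
        alias = _active_db.get()
        if alias is not None:
            return alias
        # Fall back to the database the hinted instance is bound to, if any.
        state = getattr(hints.get("instance"), "_state", None)
        return getattr(state, "db", None)  # None means "no opinion"

    def db_for_write(self, model, **hints):
        return self.db_for_read(model, **hints)

    def allow_relation(self, obj1, obj2, **hints):
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return None
```

You would list the class in DATABASE_ROUTERS and then wrap the call, e.g. with route_to("validation"): new_compound.full_clean(). Queries that already specify .using(...) or save(using=...) keep their explicit database, so the router only affects calls that don’t name one.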

This is reassuring to hear. Others on my team lacked confidence in my code and questioned my assertions that full_clean wouldn’t work on a non-default database. I think they assumed I’d done something wrong. I’d started to talk about making a custom router, but even then, there were concerns that there could be places in our model methods where something might happen on the wrong database that we hadn’t accounted for, so I grudgingly agreed to disable the validation interface until the refactor to fix the loading code properly.

So it’s validating to hear that my hunch about full_clean was on the right track.