Why can't I call full_clean on data bound for a non-default database?

hepcat72 · February 19, 2023, 4:20pm

Background: I modified our codebase a while back to make use of multiple databases for validation of user-submitted data. This is for automated checks so that they can fix simple issues with their files, like checks on uniqueness and such.

Our original code (not written by me) had some fundamental issues with loading. It had side-effects. It should have used transaction.atomic, but didn’t, and simply adding it broke the codebase. I have actually moved on to a proper refactor of the code to eliminate the need for a second database, but there was one issue with the 2-database design that I never figured out and I hate not understanding things, so I thought I would ask…

Question: I created a second database (with the alias “validation”) and inserted .using(db) and .save(using=db) in all the necessary places in our code, so that loading data can be tested without risking the production data.

Everything works as expected with the 2 databases except calls to full_clean(). Take this example:

new_compound_dict = {
    "name": "test",
    "formula": "C1H4",
    "hmdb_id": "HMBD0000001",
}
new_compound = Compound(**new_compound_dict)
new_compound.full_clean()
new_compound.save(using="validation")

It gives me this error:

django.core.exceptions.ValidationError: {'name': ['Compound with this Name already exists.'], 'hmdb_id': ['Compound with this HMDB ID already exists.']}

I get the same error with this code:

new_compound, inserted = Compound.objects.using("validation").get_or_create(**new_compound_dict)
if inserted:
    new_compound.full_clean()

Both examples above work without a problem on the default database.

I looked up full_clean in the docs for Django 3.2, but I don’t see a way to have it run against a database other than default, which I’m assuming is what I would need to do to fix this. I can’t find any mention of potential issues related to a non-default database either. I had expected the docs to show a using parameter to full_clean (like the one for .save(using=db)), but there’s no such parameter.

I debugged the above examples with this before and after each example block:

Compound.objects.using("default").filter(**new_compound_dict).count()
Compound.objects.using("validation").filter(**new_compound_dict).count()

For the default database, the counts are 0 before and 1 after with no error. For the validation database, the counts are 0 and 1, but with the error mentioned above.

At this point, I’m confounded. How can I run full_clean on a database other than the default? Does full_clean just fundamentally not support non-default databases?

Footnote: The compound loading code loads data into both databases. Compound data is never submitted by users, but it must be present in both databases so that user-submitted data can be validated against it, so the compound load script is one of 2 scripts that load data into both databases. The default load always happens before the validation load. So when I load the validation database, the test compound is already present in the default database, and it is present in the validation database only after the validation load.

KenWhitesell · February 19, 2023, 7:49pm

This is an interesting topic. I don’t have any direct knowledge or experience in this area, but I was curious enough about it to see how far I might be able to get to figure something out.

In Django 3.2.9 (the version of 3.2 that I currently have installed), it looks like the unique test for a model is performed using the Model._default_manager.

I would guess that you would probably be able to create a custom manager that checks for a specific attribute on a model instance if that instance is to be used with a non-default database.

This ought to cover those cases where a query or function involves a manager. If you’re calling save directly on the model instance, an overridden save method on the model would be able to do the same thing.

This is just the first idea that came to mind; there may be an easier way to do it.

You might also be able to use a custom router to check an instance, if the function being used provides the instance “hint” to the router.
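
A rough sketch of a router along those lines. As far as I can tell, the uniqueness query in full_clean is built from the default manager without an instance hint, so this version does not look at hints at all; instead it reads the target alias from a context variable that the calling code sets. The names ContextRouter, use_database and _active_alias are invented for this example, and it is untested against a real multi-database project.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Alias that unpinned queries should use in the current context.
# None means "no opinion", so Django falls back to its usual routing.
_active_alias = ContextVar("active_alias", default=None)


@contextmanager
def use_database(alias):
    """Route unpinned reads and writes to `alias` inside the with-block."""
    token = _active_alias.set(alias)
    try:
        yield
    finally:
        # Restore whatever alias was active before the block.
        _active_alias.reset(token)


class ContextRouter:
    """Hypothetical database router driven by the context variable above."""

    def db_for_read(self, model, **hints):
        return _active_alias.get()

    def db_for_write(self, model, **hints):
        return _active_alias.get()

    def allow_relation(self, obj1, obj2, **hints):
        return None  # no opinion

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return None  # no opinion
```

The router would be listed in the DATABASE_ROUTERS setting (for example "myproject.routers.ContextRouter", a hypothetical path), and the calling code would wrap the validation in `with use_database("validation"):` before calling new_compound.full_clean() and new_compound.save(). Queries that already specify .using(db) or save(using=db) are not affected, since routers are only consulted when no database is given explicitly.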

hepcat72 · February 20, 2023, 1:05am

This is reassuring to hear. Others on my team lacked confidence in my code and questioned my assertions that full_clean wouldn’t work on a non-default database. I think they assumed I’d done something wrong. I’d started to talk about making a custom router, but even then, there were concerns that there could be places in our model methods where something might happen on the wrong database that we hadn’t accounted for, so I grudgingly agreed to disable the validation interface until the refactor to fix the loading code properly.

So it’s validating to hear that my hunch about full_clean was on the right track.

manudar · June 17, 2024, 12:12pm

Hi, any idea on how to solve this issue? I need to work on different database connections (to the same database, but different connections) and I am running into the same problem. I tried to create my own custom manager and even my own custom router, but I am very new to Django so I cannot make it work. I mean, it behaves exactly the same as with the default manager and router.

KenWhitesell · June 17, 2024, 1:10pm

Welcome @manudar!

If you’re having an issue with multiple databases and routers, I suggest you open a new topic for your discussion. Please include all the important details, especially a complete description of what it is that you’re trying to accomplish and what specifically is not working.

hepcat72 · June 17, 2024, 2:00pm

To answer your direct question, while I think I did have some success with a custom router, I eventually solved the problem by eliminating the second database (by wrapping all the loading code in atomic transactions, so that data validation had no side-effects).

manudar · June 18, 2024, 10:58am

I created this topic: full_clean on non-default connection uses the default db for some checks
