Multiple database foreign key save issue

Hi

I’m currently working on a multiple database setup with same models/tables on all of the databases. I have 2 models which are simplified

class Person(models.Model):
    username = models.CharField(max_length=100)

class Property(models.Model):
    person = models.ForeignKey(Person, on_delete=models.PROTECT)

Assume that there are 2 database with alias A, B. And this is the method i was unittesting

def create_property(person, database_alias):
    property = Property(person=person)
    property.save(using=database_alias)
    return property

So i was thinking of a test where person model object is from database A, and database_alias is B, and I assumed that django prevents this kind of relation with some kind of error raised, so i wrote the test case with initial data as given below.

Initial data:

Database A
Person table
id        username
1         person1

Database B
Person table
id        username
1         person1

Test code

person = Person.objects.using("A").get(id=1)

property = create_property(person, "B")

But it turns out that it actually worked and no errors were raised.

  1. Am I using .save and model object creation in a wrong way? or does django doesn’t check if foreign key instance is not in same database as property?
  2. If I check property._state.db it is from database “B” but if I check property.person._state.db it is from database “A”, so is this really a cross db relationship?
  3. Should I explicitly have a check like person._state.db == database_alias inside the create_property function to avoid this or does django provide some easy way to do this.

I did realise that if the person record (id =1) in database B doesn’t exist then .save throws an error stating that it doesn’t exist so it does check whether that foreign key exists in that same database but how on accessing the foreign key its accessing database A’s record here?.

I would really like to know the reasons for all these questions, I wasn’t able to find anything in the docs except for this information (https://docs.djangoproject.com/en/3.1/topics/db/multi-db/#selecting-a-database-for-save)

Any input is appreciated! Thank you so much

Always remember that underneath the covers, the ORM is creating SQL statements to be issued to the database.

What that means here is that since the “ForeignKey” field in the model is only the PK of that related field, it does not maintain the database instance of that object. So your item #2 is correct - you are not creating a cross-db relationship. In fact, if you were to assign a different person to the other database, you would see that that FK is going to access that different person.

So no, there is no cross-database activity being performed. You can verify this by watching the actual queries being issued when you do this.

Ken

1 Like

Thank you for the response!

I understood now that it refers to the specific field not the whole object itself.
If its not a cross-database relationship, then can you explain what is happening in this case below,

When I save a Person model instance (lets say on A) as FK to a different database (database B) on Property model, I think it does translate into referring to database B’s Person instance (its primary key) which has same pk value, instead of database A’s Person instance on SQL, as saving a different Person model instance from database A whose PK doesn’t exist on database B is prevented with an IntegrityError being raised by django like so,

DETAIL:  Key (person_id)=(2) is not present in table "someapp_person".

But when I get the Property model instance which is saved in B and refer to this foreign key’s data, the data is actually from database A, which makes me think, Is Django actually doing the lookup in a wrong way due to me not using save correctly?

Thank you

In the absence of seeing specific code (models, queries) of sufficient detail to recreate what you’re trying to explain, I hesitate to try and provide any kind of authoritative answer.

I would start with examining the tables directly in your database to verify that the contents are what you expect them to be.

With all the caching that occurs and work that Django does automatically, there are too many possibilities for me to address them all outside the context of a specific example.

1 Like

The scenario I’m explaining is as follows,

class Person(models.Model):
    username = models.CharField(max_length=100)

class Property(models.Model):
    person = models.ForeignKey(Person, on_delete=models.PROTECT)


person1 = Person(username='helloA')
person1.save(using='A')

person2 = Person(username='helloB')
person2.save(using='B')

# We have id=1 person in database A and id=1 person in database B now.

property = Property(person=person1)
property.save(using='B')

print(property.person.username) # Prints helloB instead of helloA, why?

The field property.person has the value 1. Property was saved in B, so the fk relationship will be to Person with id =1 in B -> helloB. No cross-database relationship exists.

If you then save property into A, you should see that property.person.username is helloA.

1 Like

Hi,

Sorry I made a mistake, the last line actually prints helloA instead of helloB, how?

If that’s the case, it’s probably because Django has that value cached, and no query is being executed for that reference.

I suggest you run manage.py shell_plus —print-sql to see the queries being executed. It’s going to make things a lot clearer to you when you can see what’s going on. (Shell_plus is part of the Django-extensions package.)

1 Like

To quickly follow up, my guess would be that if you were to do the following two lines after the print statement:

print(property.person.username)

property_2 = Property.objects.using("B").get(id=1)
print(property_2.person.username)

You would see it printing helloB, giving you the situation where
property.id == property_2.id
but
property.person.username != property_2.person.username
(I’m not in a position where I can check this right now, but it’s my guess.)

1 Like

Hi,

Yea I did this first thinking maybe I assigned the whole object to the Property model __init__ and saving it so its referring to the model (Person from db A) in that print statement.

But on refetching it from database (with both .refresh_from_db and .get), I was astonished that it still pointed to helloA username on database A, I even printed _state.db of property.person._state.db it said A, but property._state.db says B, So thats why I’m totally confused.

I think I’ve lost track of what your question is at this point.

An FK is the PK of the related table. At the database layer, there is nothing indicating a database.

A Django Model Object does track the database. That you have an Object associated with DB “A” with a reference to an Object associated with DB “B” does not in any way mean that there’s a cross-database relationship between the two.

1 Like

Yes from the explanation, I did understand that its not a cross-database relationship as in the database layer itself it just stores the PK of the related field, I’m just suprised that,

  1. Django doesn’t check the model object given as input to be saved to a different database (with using) and raise error, as Django does know that this foreign key object is from different database model from the using keyword argument.
  2. Django on retrieving the property row, on lookup it looks at a different database even though its not a cross database relationship.

Thats why I asked if I’m in anyway using the .save or the model object initialization incorrectly (this case is not documented as a warning in documentation).

So I just want to know if anyone has faced this problem and has dug into the source code to know what does happen when I do save using one database, with a foreign key object from a different database (database querywise its fine, since it only takes the primary key of the FK field.) and why Django doesn’t raise an Exception.

1 Like

So I’ve done some more digging, and the general principles are documented in the Cross-database relations paragraph on the Multiple Databases page.

Briefly, as it relates to your questions here, it is possible to create an environment where, on some databases, it is possible to create something that effectively works as an FK across databases - but it’s not something that Django officially supports. However, since it is possible under the right circumstances, it shouldn’t be something that throws an error.

But, regarding your point #2, I don’t see where you’ve demonstrated that that is what has actually happened. Nothing you’ve described previously indicates that that’s what’s happening in the absence of something previously being loaded / cached.

For example, you mentioned that you used “refresh_from_db”. But, that’s not going to change anything, because the FK you’re retrieving from the database doesn’t change. It’s still the same value, and so it’s not going to change what it’s referring to. Likewise with the get. Without specifying a select_related - and possibly not even then - again, it’s not going to retrieve the related object, which is the object being questioned here.

To gather more information:

  • re-run your examples using shell_plus from django-extensions with the --print-sql option
  • when you’re looking at objects, don’t just look at the values in the object - look at the id of the object. (I’m not talking about Django’s ID field, I’m referring to the python id function for that model). This will let you know whether or not you’re looking at the same object.
  • After you’ve created the situation you’ve described, try performing the get to a fresh variable to avoid seeing effects due to cacheing.

Through all of this, pay particular attention to the queries being generated.

1 Like

I can’t thank you enough for your well thought out response, I will try those steps mentioned.

However, since it is possible under the right circumstances, it shouldn’t be something that throws an error.

I must save this somewhere, this was my assumption which led me to doubt the cross-database django issue.

To expand on my previous answer a little bit and to give you a real-live example (which I just remembered this morning - I wasn’t thinking about this project yesterday).

You don’t need to restrict your usage of Django’s support for multiple databases to those cases where you’re actually referencing a different database.

I’ve got a situation where we’re working with pre-existing data in a database under very strict access rules. We’re given a set of credentials having read-only access to particular tables and views. We must use those credentials to access those tables, which are different than the ones being used for the rest of the system.

We do this by defining another database entry using those restricted credentials and some custom routers.

I don’t remember right off-hand whether or not we actually create a relationship between any of our tables and the restricted-access tables, but it’s logically possible. (I don’t think I’ve even looked at that project in 3 years and I’m not sure if it’s still being used…) In that situation, what Django would see as a cross-database reference really isn’t - they’re two tables in the same database.

2 Likes

Oh this looks interesting, so you define a different database alias with those restricted credentials, with appropriate db routers in place to select the right db for a query, even though the two tables belongs to different db alias, actually they belong to same db. Looks like this might be of some use to me in the future.

Thank you for your guidance!

1 Like