Incorrect Raw SQL Column References

Hi All,

I’ve stumbled across an issue in my web application and I’m not sure whether this is a bug or a fault of my own.

Let’s start with some models (these only include the relevant attributes for brevity).

# in django app called users
class Contributor(models.Model):
    uuid = models.UUIDField(primary_key=True, default=uuid.uuid4)
    name = models.CharField(max_length=512)
    user = models.ForeignKey(CustomUser, null=True, on_delete=models.CASCADE)
    university = models.ForeignKey(University, null=True, on_delete=models.CASCADE)

    def __str__(self):
        return self.name


# in django app called cases
class Case(models.Model):
    uuid = models.UUIDField(primary_key=True, default=uuid.uuid4)
    contributing_person = models.ForeignKey(Contributor, on_delete=models.SET_NULL, null=True)

It would appear that the necessary migrations have been made (again, edited for brevity)

# users app migration
    operations = [
        migrations.CreateModel(
            name="Contributor",
            fields=[
                ("uuid", models.UUIDField(default=uuid.uuid4, primary_key=True, serialize=False)),
                ("name", models.CharField(max_length=512)),
                (
                    "university",
                    models.ForeignKey(
                        null=True,
                        on_delete=django.db.models.deletion.CASCADE,
                        to="users.university",
                    ),
                ),
                (
                    "user",
                    models.ForeignKey(
                        null=True,
                        on_delete=django.db.models.deletion.CASCADE,
                        to=settings.AUTH_USER_MODEL,
                    ),
                ),
            ],
        ),
    ]
  
 
# cases app migration
    operations = [
        migrations.AlterField(
            model_name="case",
            name="contributing_person",
            field=models.ForeignKey(
                null=True, on_delete=django.db.models.deletion.SET_NULL, to="users.contributor"
            ),
        ),
    ]

When I create a Case object where contributing_person = None the everything runs as expected. However, when I create a Case where contributing_person points to a User object, I get the following exception.

column cases_case.contributing_person does not exist
LINE 1: ...eography_id", "cases_case"."contributing_uni_id", "cases_cas...
                                                             ^
HINT:  Perhaps you meant to reference the column "cases_case.contributing_person_id".

I dove into my database to take a look at my cases_case table to see what the column was called.

my_database=# select contributing_person_id from cases_case;
 79b1b921-0668-4394...
 79b1b921-0668-4394...
 
my_database=# select contributing_person from cases_case;
ERROR:  column "contributing_person" does not exist
LINE 1: select contributing_person from cases_case;
               ^
HINT:  Perhaps you meant to reference the column "cases_case.contributing_person_id".

As expected, there is no contributing_person column, but rather a contributing_person_id column.

What I find strange is that there is data in the table which looks to be sane. However, when I try to add some data now, I get the above exception.

I have had some migration issues which I think resulted from some duplicate files resulting from a git operation. I had a file called app_migration 2.py which made its way into my database. I updated the row in the django_migrations table to point to the original file app_migration.py. I reverted to a previous commit before I had deleted the app_migration 2.py and compared it with app_migration.py. They are identical.

I does strike me as odd, after taking a glance at the django source code (I’m not saying it was at all comprehensable for my noggin) but I did notice that the models.ForeignKey object create FKs column names in the DB with an _id appended to the model’s attribute’s name.

I suppose what I am getting at is why is Django trying to insert something into contributing_person column when it appears it should actually be contributing_person_id? From where is Django getting the contributing_person column name? I’d like to work out whether this is my fault, and if so, how to fix it, or if it is a Django bug so I can go and report it.

Any help as always is warmly received. Thank you.

Cheers,

Conor

This is working as designed by Django. You can find more information in the docs here. You can override that field using db_column (docs), but I’d recommend using the default functionality unless it makes sense not to. The reason being is that most Django devs would expect that, which makes the application easier to understand and reason about.

It’s good to remember that this is an ORM. It’s doing a fair amount of translation from the database to the data layer in your application. The specifics aren’t going to match completely. Django takes an opinionated approach which allows folks to get running quickly. Although it does make things a bit more complex when you do have to drop down to the database level.

Hi CodenameTim,

Thanks for the reply. I agree, it looks like Django is doing exactly as it should. The point that I’m confused about is why Django/my code is attempting to insert some data into a column which doesn’t exist in a table which was created by the ORM.

The code that is creating the problem is:

            # the below line has a Contributor object returned to contributing_person
            contributing_person, _ = self.process_contributor(case_dict.pop("contributing_person"))

            case, created = Case.objects.filter(species=species).update_or_create(
                id=data.case,
                species=species,
                defaults={
                    **case_dict,
                    "breed": breed,
                    "image": image,
                    "patient": patient,
                    "geography": geography,
                    "contributing_uni": uni,
                    "contributing_person": contributing_person,
                },

Attributes in the default dict above all reference columns with an _id appended to them. So the big mystery, for me at least, and the question I’m trying to solve is why is the above code trying to access a column name contributing_uni when it should have an _id appended, just like all the other attributes?

Cheers,

Conor

<conjecture>
I believe it has to do with how the model is created in the update_or_create method.
These parameters aren’t part of a query, so they don’t go through the ORM translation layer to become a query, and so the implicit translation doesn’t occur - they’re passed directly to the model class initializer, where the actual field has the _id suffix.
</conjecture>

I’m pretty sure in the past I’ve had to specify the _id suffix on the foreign key fields and reference the id field on the referenced object, so that I would have something like:
"contributing_uni_id": uni.id
(at least that’s what I’m finding in my most recent project using this)

Ken

Is that code the actual code that’s generating the error from your original question? Your question indicates Raw SQL in the title and I agree if you’re getting those exceptions with the wrong field names, it definitely seems weird. If that’s not the code, could you post that?

@KenWhitesell As a counter to your experience, I’ve been able to use full model instances in the defaults to a call to update_or_create successfully. That said, I typically use the _id approach because I may not have access to the referenced instance, but only the ID.

1 Like

Cool - I just confirmed what you mentioned, in a different project, using the shell. (Makes me wonder about our earlier experience, but that’s a completely different question and not relevant here - and may be that in the case I looked at, we only had the id as well.)

1 Like

Hi Ken, CodenameTim,

Apologies that I didn’t make it clear that I was using the update_or_create() method and it is in fact this code that exception refers to.

@KenWhitesell, like @CodenameTim says, I use both update_or_create() and get_or_create() extensively and I have never come across this issue. Out of curiosity, I did as you wrote, Ken. That worked fine for other get_or_create() methods but not this particular problematic one. I’m starting to suspect that there is a strange bug in my code somewhere that is up to no good.

I am well and truly perplexed at this point but I’m going to dedicate tomorrow and Thursday evenings (this is my side project in attempt to start something new). I’ll get back to this thread when I’m done with the troubleshooting and let you know how I’m progressing.

Cheers to you both for your thoughts and help so far.

Cheers,

Conor

Well, this has been a weird one and I still haven’t understood why it happened. I had three systems all running almost the same code in docker on three different hosts (prod, dev, and very dev), all of which had their own database, albeit it with identical migrations. One of the three, the one with the issue was running on Postgres 11, the others on Postgres 12. I restored a backup of the entire database from one of the functioning systems onto the system with the issue. Problem gone. No idea why. I’d like to find the root cause, but I’m also rather glad that the issue is behind me. Luckily it wasn’t my production system which had the problem.

My conclusion is that there was either an issue with Django + Postres 11 contra 12 (unlikely) or a stuff up somewhere from me (very likely) that caused the issue.

Apologies for wasting your time with my post here, but regardless, thank you for your input.

Cheers,

Conor

1 Like

I suspect you had run a slightly different migration, then pushed a new migration with the same name to that environment throwing the data model out of sync with the database.

I think in the future viewing what the table structure is in the database and comparing with manage.py sqlmigrate could help you find the problem faster.

2 Likes