Having some trouble understanding how prefetch_related works

I’m having trouble translating specific joins to Django (original code in c#). The culprit is simply that I really lack DB knowledge and what’s going on under the code. I will spend some time today trying to understand joins better, but I’d really like to get over this rut. :slight_smile:

For example I see this kind of code:

db.someTable1.Join(
  db.someTable2,
  l => new { l.machineNumber, l.clientId },
  r => new { r.machineNumber, r.clientId },
  (l, r) => new 
  {
    historyId = l.id,
    l.machineNumber,
    l.clientId,
    ...
  }
)

The DB tables do not contain any foreign keys and I am not able to modify the database. I think this means that select_related is out of the question.

My attempt is

modelForTable1.objects.prefetch_related('machinenumber__machinenumberids_set', 'clientid__machinenumberids_set')

My 3 questions are:

  1. Am I on the right track?
  2. If I am doing joins with more than one table, should I be chaining them together? Or should I put putting everything as parameters into the method? For example modelForTable1.objects.prefetch_related('machinenumber__machinenumberids_set', 'clientid__machinenumberids_set', 'clientid__anothertable_set')
  3. Is there a better way of doing joins that do not modify the existing DB?

Thanks in advance!

The select_related and prefetch_related functions have nothing do to with being able to join tables. They are only a performance-enhancement facility. (Well, not strictly true, but close enough for this discussion.)

You don’t specify joins. The ORM will join tables as necessary based on ForeignKey fields.

Now, if you don’t have any foreign keys, then you can’t join tables together directly. Depending upon what the precise needs are, you might be able to augment results using aggregation or annotation. But as often as not, you’ll still end up needed to do extra queries for this.

Hm I’m convinced that the above does have to do with joins! Django Docs say that select_related works by creating an SQL join and including the fields of the related object in the SELECT statement, but maybe I misinterpreted your statement.

It sounds like the best thing I can do is aggregation or annotation. Maybe it’s fine to take a performance hit.

Actually, after a bit more searching, I came across this post where the top answer suggests that: The Django ForeignKey is different from SQL ForeignKey. Django ForeignKey just represent a relation, it can specify whether to use database constraints.

Does that mean I can actually just change my models to have foreignkey fields and it won’t affect accessing the DB and whatnot?

I think that may be the case here.

[Emphasis changed]

While select_related will cause the ORM to use joins in cases where it otherwise wouldn’t, it isn’t the only clause causing the ORM to use a join, and prefetch_related does something completely different.

Yet another case where SO may be doing more harm than good.

No, that interpretation is incorrect. A Django ForeignKey field is a database foreign key.

No, this is not correct. Your models are a Python representation of a database table, with the added possibility of adding functionality at the Python layer. Every field defined in your model must correspond to a column in the table.

That all really depends upon the specific query and the quantity of data involved. Whether or not there’s really going to be a performance hit is going to depend upon many factors. The PostgreSQL query planner really is quite amazing.