I’m having trouble translating specific joins to Django (original code in c#). The culprit is simply that I really lack DB knowledge and what’s going on under the code. I will spend some time today trying to understand joins better, but I’d really like to get over this rut.
For example I see this kind of code:
db.someTable1.Join(
db.someTable2,
l => new { l.machineNumber, l.clientId },
r => new { r.machineNumber, r.clientId },
(l, r) => new
{
historyId = l.id,
l.machineNumber,
l.clientId,
...
}
)
The DB tables do not contain any foreign keys and I am not able to modify the database. I think this means that select_related is out of the question.
If I am doing joins with more than one table, should I be chaining them together? Or should I put putting everything as parameters into the method? For example modelForTable1.objects.prefetch_related('machinenumber__machinenumberids_set', 'clientid__machinenumberids_set', 'clientid__anothertable_set')
Is there a better way of doing joins that do not modify the existing DB?
The select_related and prefetch_related functions have nothing do to with being able to join tables. They are only a performance-enhancement facility. (Well, not strictly true, but close enough for this discussion.)
You don’t specify joins. The ORM will join tables as necessary based on ForeignKey fields.
Now, if you don’t have any foreign keys, then you can’t join tables together directly. Depending upon what the precise needs are, you might be able to augment results using aggregation or annotation. But as often as not, you’ll still end up needed to do extra queries for this.
Hm I’m convinced that the above does have to do with joins! Django Docs say that select_related works by creating an SQL join and including the fields of the related object in the SELECT statement, but maybe I misinterpreted your statement.
It sounds like the best thing I can do is aggregation or annotation. Maybe it’s fine to take a performance hit.
Actually, after a bit more searching, I came across this post where the top answer suggests that: The Django ForeignKey is different from SQL ForeignKey. Django ForeignKey just represent a relation, it can specify whether to use database constraints.
Does that mean I can actually just change my models to have foreignkey fields and it won’t affect accessing the DB and whatnot?
While select_related will cause the ORM to use joins in cases where it otherwise wouldn’t, it isn’t the only clause causing the ORM to use a join, and prefetch_related does something completely different.
Yet another case where SO may be doing more harm than good.
No, that interpretation is incorrect. A Django ForeignKey field is a database foreign key.
No, this is not correct. Your models are a Python representation of a database table, with the added possibility of adding functionality at the Python layer. Every field defined in your model must correspond to a column in the table.
That all really depends upon the specific query and the quantity of data involved. Whether or not there’s really going to be a performance hit is going to depend upon many factors. The PostgreSQL query planner really is quite amazing.
Please explain, how is this false? (Please include your references.) Also please ensure that you have read the complete thread, and are addressing this in the full context of everything I have written above.
There is no physical need to ever use select_related or prefetch_related aside from their performance enhancements.
I must admit, @KenWhitesell, that I don’t know what you mean. This:
(whichever part you emphasise) reads to me like “select_related and prefetch_related do not create queries that use joins”. Which is, I think, also how @alxvallejo is reading it.
But you’re saying that’s not what you mean. I believe you, but I don’t understand what you do actually mean. Could you expand on it a little so that we understand?
Warning: This is going to be a long answer, because a full explanation requires both some background and context.
First, it should always be understood that every answer I write here is targeted specifically to the question being asked, and in some cases, to the questioner.
Reading an answer without first reading the question (or the complete thread to that point) can be misleading. Likewise, looking at part of an answer, without considering the response as a whole, is also likely to be misleading.
Now, to establish the context behind my answer in this specific situation, I’m going to quote (and extract) what I think are the most relevent parts of the original post.
…
My interpretation of all this is that the OP is thinking that they can create SQL joins between unrelated tables (no foreign keys) using either or both prefetch_related and select_related.
The implications of that, lead me to believe that the OP may be of the belief that it’s those two functions that create the ability to join two tables. It’s that belief that I was addressing.
Side Note: This is not an unusual interpretation or line of thinking. There have been about a half dozen other questions that I have answered with this same misconception. Questions where people want to join tables not having any relationship connection between them, and thinking they can use one of those two functions to force the ORM to join them.
To the extent of my knowledge, this is not an accurate belief. (I am quite willing to be corrected on this if my knowledge is out-of-date - my knowledge of the ORM does not go that deep.)
It is not the usage of select_related or prefetch_related that creates the ability to join tables. In the general case, for the ORM to create any join at the SQL layer, there must be an FK somewhere in the mix.
Also take note of the caveat in the last sentence at the end of this oft-quoted paragraph:
This is also to emphasize that I’m not making a universal statement here - this is intended to apply to this question.
Finally, the first sentence of the last paragraph wraps up the original answer:
Again - addressing the original question. You can’t add either select_related or prefetch_related to an ORM statement to join tables that are not related by foreign keys. Neither of those functions create an ability to perform a join.
Then, the other part of this is in my response to the first person asking about my original answer:
This is to emphasize the point that select_related does not create the ability for the ORM to use a join. It is the foreign key that creates that ability. The select_related function instructs the ORM to use that ability to join the tables. (And, that there are many other ways in which a join can be created automatically, such as in filter, annotation, or values functions. )
Close, but my statement is more broad than that. If you wanted the briefest possible phrasing, it would be:
There’s no need to qualify that statement with the “do not already have …”, the short version stands on its own. Those functions do not create the ability to join tables.