Linking 2 tables with 2 non-unique fields that together are unique

I’m trying to figure out how to relate 2 models where they have 2 fields that together form a unique identifier.

e.g.

class Model1(models.Model):
        item_name = models.CharField(max_length=50, null=True, blank=True)
	item_number = models.PositiveSmallIntegerField()
        product_id = models.PositiveIntegerField()

class Model2(models.Model):
        item_name = models.CharField(max_length=50, null=True, blank=True)
	item_number = models.PositiveSmallIntegerField()
        detail1 = models.CharField(max_length=20, null=True, blank=True)

Together the item name and number are a unique identifier. Is there a way to relate these models?

Thanks.

Please be more specific - provide some details about what it is you’re trying to accomplish.

The terms “linking” and “relating” are fairly ambiguous here.

Is there a specific set of results you’re looking for from a query? Or are you trying to address this from a more general perspective?

Sure. The two tables have related, but not identical information. I would like to be able to list data from the two tables in a view - say in a table list, for example. So by querying an item in Model1, you get a list of the details from Model2. Does that make sense?

Yep, that can be done. What you need to factor in is the “disjoint” situation - how do you want to handle those cases where the combination (item_name, item_number) is in Model1 but not Model2? And what about the reverse? How do you want to handle that situation where that combination is in Model2 but not in Model1?

Related to this (because it may make things easier), do you have control over these models? Are these models part of your project, or are they data from a different project?

What are the sizes of these models? (Approximately how many rows in each?) Do these models grow consistently over time, or are they relatively static?

If (item_name, item_number) create a unique natural key, is there a reason these entities are two different models?

I am in control of the models, but not the data. The data is coming from two different sources, but the data from Model2 augments the data from Model1. Model2 should be immutable. Data from Model1 is being added. There should not exist a condition where a record from Model1 does not already exist in Model2. There will be data in Model2 that does not exist in Model1. The size is considerable, say up to 20000 records in each case.

They are in different models only because I’m not sure how to have them in the same model. If that is a better way, then I’ll do that. If I can have Model2 already existing, and then import the data from Model1 to add to it, I’ll do that. Please advise.

How are you getting the data for Model1? Is this data just a 1-time load or do you get new data periodically? Do these data loads come in in batches or one-at-a-time?

The data for Model1 comes in batches periodically. To complicate matters, Model2 also grows periodically. Also in batches.

If I were doing this, I’d be setting up the imports of the Model1 data as updates to Model2.

How are you getting these updates? (Are they files being uploaded? Do you have an API where they’re being submitted? Is someone typing them in manually?)

What is your current process for handling these updates? What is the typical size of these updates? (How many rows are being updated?)

Both files come to me as .csv lists. I have been uploading them with the admin panel using import/export. Model2 is updated quarterly with about 300 entries. Model1 is updated more arbitrarily in batches up to 1000. Model2 is much older and is very large (at least for me).
The question I would have about your approach, then, is that if I use the imports of Model1 as updates to Model2, are there any pitfalls when referring to Model1 data, since I believe all of the other fields (not item_name or item_number) would have to be set to null=True?

I’m not sure. We really haven’t gotten into the discussion of what this data really is, or how it’s being used.

If this is a case where the number of instances of Model 1 is significantly smaller than the instances of Model 2 - or if you need to make the distinction between them in usage, then perhaps another solution is appropriate.

Instead of combining Model 1 with Model 2, you could still have a separate Model 1, but instead of the (item_name, item_number) fields being stored in Model 1, you could create a OneToOneField relating that instance of Model 1 with Model 2.

That sounds appropriate. How would I go about doing that? And thanks very much, btw!

Create the Model 1 with the one-to-one field as the primary key to Model 2 instead of the (item_name, item_number) fields. Then, for processing the individual rows of the updates, use the update_or_create method on Model 1 to either update the existing row or create a new row as appropriate.

Since you’re effectively creating a “multi-table inheritance” situation here, you could also define Model 1 as a subclass of Model 2. (That may make some of your subsequent processing a little more “intuitive”.)

I tried defining Model1 as a subclass of Model2, but I’m not sure it does what I need. When I create (or import) a record for Model1, it creates a new entry in Model2. But I need the item_name and item_number to form a unique set. Essentially, I want to have all of the fields populated in Model2, and then add the related data in Model1 as it comes along. Can I do this with the table inheritance? Am I just not doing it right? Or should I try the other suggestion you made with the one-to-one field? Thanks.

I think the simplest, and best solution is to concatenate the two fields before import in both datasets - in order to create a new primary key. This much is within my wheelhouse. Otherwise, I’m getting into weeds that would take me considerable effort to figure out.

Sorry, somehow I lost track of this thread.

Yes, absolutely. You can create the Model2 instance with its data, then come back at any later point to create the Model1 instance to be associated with it.

Also a perfectly valid solution! Under these circumstances, I can see this being a case of “Practicality beats purity”. There are times when “good enough” is good enough.