Combine multiple querysets from different models with different fields

Hello,

I have 3 different models (by different I mean that they do not have all the same fields, some are the same).

I found that it’s possible to ‘combine’ querysets from different models that have the same fields.

My question is: is it possible to do the same with different models that do not have the same fields. I yes, can I limit the Queryset the same way I do with a ‘normal’ queryset : Entry.objects.all()[5:10]

Thanks for your help.

I think it’s going to be a lot easier to address your questions if you’re more specific about what you’re trying to achieve. What is a minimal representation of your 3 models?
Then what is the output you are looking to generate? (I’d like to avoid an “X-Y problem” type situation.)

I want to do an readonly API endpoint with a limit/offset based pagination that return a list of items. thoses items come from 3 models.

Workflow

  • I receive a request on the API endpoint url, this request include limit and offset arguments

  • I make a query in 3 different tables (I filter the query to get only published = True, this field is present on each model), limiting the query with the offset/limit (each model has a timestamp_created = DateField(auto_now_add=True, that can be used for sorting) a have a result of x objects

  • with this ‘collection’ of objects, for each object I need to create a dictionary to map the fields present in the model with a unique JSON format (meaning that all models must ‘comply’ with this single JSON format). I also need to compute some data or generate uri that will be added to the dict.

  • In the for loop used above to convert the object data in a dict at the end of the conversion I add the dict to a list

  • When I have the list of dict I will it in view using the django.http.JsonResponse

Models

class Prescription(models.Model):
    reference = models.CharField(max_length=15, unique=True)
    slug = models.SlugField(default=slugify(reference), editable=False)
    sub_code = models.ForeignKey("certificates.SubCode", on_delete=models.PROTECT)
    object = models.TextField()
    pdf_file = models.FileField(
        upload_to="uploads/prescriptions/", default=None, null=True, blank=True
    )
    pdf_file_en = models.FileField(
        upload_to="uploads/prescriptions/", default=None, null=True, blank=True
    )
    pdf_file_fr = models.FileField(
        upload_to="uploads/prescriptions/", default=None, null=True, blank=True
    )
    pdf_file_nl = models.FileField(
        upload_to="uploads/prescriptions/", default=None, null=True, blank=True
    )
    published = models.BooleanField(default=False)

    timestamp_updated = models.DateTimeField(auto_now=True)
    timestamp_created = models.DateField(auto_now_add=True)
    comment = models.TextField(default=None, blank=True, null=True)
    note = models.TextField(default=None, blank=True, null=True)

class Regulation(models.Model):
    reference = models.CharField(max_length=15, unique=True)
    slug = models.SlugField(default=slugify(reference), editable=False)
    type = models.ForeignKey(Type, on_delete=models.PROTECT)
    code = models.ForeignKey(
        "certificates.Code",
        on_delete=models.PROTECT,
        default=None,
        blank=True,
        null=True,
    )
    sub_code = models.ManyToManyField("certificates.SubCode", default=None, blank=True)
    object = models.TextField()
    pdf_file = models.FileField(
        upload_to="uploads/regulations/", default=None, null=True, blank=True
    )
    pdf_file_en = models.FileField(
        upload_to="uploads/regulations/", default=None, null=True, blank=True
    )
    pdf_file_fr = models.FileField(
        upload_to="uploads/regulations/", default=None, null=True, blank=True
    )
    pdf_file_nl = models.FileField(
        upload_to="uploads/regulations/", default=None, null=True, blank=True
    )
    published = models.BooleanField(default=False)

    timestamp_updated = models.DateTimeField(auto_now=True)
    timestamp_created = models.DateField(auto_now_add=True)
    comment = models.TextField(default=None, blank=True, null=True)
    note = models.TextField(default=None, blank=True, null=True)

class Certificate(models.Model):
    ACTIVE = 1
    PENDING = 2
    SUSPENDED = 3
    REVOKED = 4
    STATUS = (
        (ACTIVE, "active"),
        (PENDING, "pending"),
        (SUSPENDED, "suspended"),
        (REVOKED, "revoked"),
    )
    uuid = models.UUIDField(default=uuid.uuid4, editable=False, unique=True)
    company = models.ForeignKey("address_book.Company", on_delete=models.PROTECT)
    contact_email = models.EmailField()
    sub_code = models.ForeignKey(SubCode, on_delete=models.PROTECT)
    status = models.PositiveSmallIntegerField(choices=STATUS, default=PENDING)
    published = models.BooleanField(default=False)
    custom_url = models.URLField(default=None, null=True, blank=True)
    creation_date = models.DateField(default=now)
    pdf_file = models.FileField(
        upload_to="uploads/certificates/", default=None, null=True, blank=True
    )
    internal_id = models.CharField(max_length=25, unique=True)
    timestamp_updated = models.DateTimeField(auto_now=True)
    timestamp_created = models.DateField(auto_now_add=True)
    comment = models.TextField(default=None, blank=True, null=True)
    note = models.TextField(default=None, blank=True, null=True)

Great! That clears up a lot.

In the general sense, I’d be looking to take advantage of as much of the built-in features as possible.

So let’s assume for the moment that your JSON format is straight-forward. You’re building a list of dicts, where the keys are the same regardless of the originating model.

Also assume for the moment that those desired keys are simply named ‘key1’ and ‘key2’.

I’d use annotations and the values method to build the JSON for each query, and concatenate the lists together.

I’d start with something like:

results = (
    list(Prescription.objects.filter(...).annotate(key1=F('reference'), key2=F('slug')).values('key1', 'key2')) +
    list(Regulation.objects.filter(...).annotate(key1=F('reference'), key2=F('slug')).values('key1', 'key2')) +
    list(Certificate.objects.filter(...).annotate(key1=F('company'), key2=F('sub_code')).values('key1', 'key2'))
)

This is just a starting point of course, but it should give you some ideas on how to build from it.

great thank you,

With your approach, If I understand well, I first get all the objects I want and after I do implement a custom pagination ? I do need this limit/offset pagination as it is a requirement of the scrapper client.

before your response, I was evaluating the idea of setting a new model ‘api_objects’ and with a post save signal on the 3 models, trigger the generation of the dict and store it in a JSON field and store this in this new table, then query , filter with limit the normal way.

I don’t like to have twice the same data in the DB, but here we are talking about 200 rows maximum.

What do you think about this approach ?

I can’t definitively answer that from the information you’ve provided. But if you’re only talking about ~ 200 rows, that’s probably the approach I would take.

Depending upon how you need those results ordered or interleaved, you could identify which range needs to be retrieved for any specific request.

For example, if there are 100 rows in each of the three tables, and you get a request for rows 91 - 115, you can calculate that you need the last 10 rows of table 1 and the first 15 rows of table 2, and build your query accordingly.

There are very few situations where I ever recommend replicating data in the database. (And this definitely isn’t one of them.) It’s too easy for the data to get out-of-sync and in the long run has always caused me more problems than it has saved.

Thank you very much

You are always a big help…