Prefetch 'to_attr' attribute results in an Attribute not found error

Hey folks,

I have a model setup like this:

from django.db import models

class B(models.Model):
    name = models.CharField(max_length=100)

class A(models.Model):
    b = models.ForeignKey(B, on_delete=models.CASCADE)

class C(models.Model):
    b = models.ForeignKey(B, related_name='c', on_delete=models.CASCADE)

and I am trying to run this query:

from django.db.models import Prefetch

qs = A.objects.prefetch_related(Prefetch('b__c', queryset=C.objects.all(), to_attr='test'))

but when I try to access qs[0].test I get an AttributeError saying the model has no attribute test.

I’m not sure what I am doing wrong here. I’m on Django 4.2.

I believe you need to access qs[0].b.test because the prefetched relation is related to instances of B.
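As a rough illustration of why, here is a plain-Python analogy. It is not Django code and not how Django implements prefetching. The dataclasses and the attach_prefetched helper are hypothetical stand-ins; the only point is that the list named by to_attr ends up on the object that owns the last relation in the lookup path (B), not on the object the queryset is built from (A).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class B:
    id: int
    name: str


@dataclass
class A:
    id: int
    b: B


@dataclass
class C:
    id: int
    b_id: int


def attach_prefetched(a_rows, c_rows, to_attr):
    """Hypothetical stand-in for Prefetch('b__c', to_attr=...)."""
    # Group the C rows by the B they point at.
    by_b = defaultdict(list)
    for c in c_rows:
        by_b[c.b_id].append(c)
    for a in a_rows:
        # The list is set on the B instance (the owner of the reverse
        # relation 'c'), not on the A instance.
        setattr(a.b, to_attr, by_b.get(a.b.id, []))


b = B(id=1, name="shared")
qs = [A(id=1, b=b), A(id=2, b=b)]
attach_prefetched(qs, [C(id=10, b_id=1), C(id=11, b_id=1)], to_attr="test")

print([c.id for c in qs[0].b.test])  # the prefetched list lives on B
print(hasattr(qs[0], "test"))        # A itself never gets the attribute
```

With the real models, the equivalent access would be qs[0].b.test, as described above.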

Hey, thanks, this was it. I assumed it would be accessible from the parent queryset, since that’s where we are calling the Prefetch from.

Is there support for traversing multiple many-related relations? I’m guessing no?

For example, I just tried:

In [5]: for rec in ArchiveFile.objects.prefetch_related(Prefetch("peak_groups", to_attr="pgs"), Prefetch("peak_groups__msrun_sample__sample__animal__studies", to_attr="studs")).distinct():
   ...:     print(f"rec.filename", end=None)
   ...:     for pg in rec.pgs:
   ...:         for s in pg.msrun_sample.sample.animal.studs:
   ...:             print(f"\t{pg.name}\t{s.name}")
   ...: 
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-5-f65b9c2d3baa> in <module>
      2     print(f"rec.filename", end=None)
      3     for pg in rec.pgs:
----> 4         for s in pg.msrun_sample.sample.animal.studs:
      5             print(f"\t{pg.name}\t{s.name}")
      6     for mz in rec.mzs:

AttributeError: 'Animal' object has no attribute 'studs'

(What I really want, incidentally, is just a unique list of studies associated with an archive file, i.e. I don’t actually want the peak_groups. In fact, msrun_sample has 2 links to archive file, so to take it a step even further, I want like, a coalesce of the studies from those 3 connections.)

There definitely is support. In your example you’ve mixed “peak_groups” and “pgs”. Either change your second prefetch to begin from “pgs”, or remove the first prefetch and iterate through peak_groups.all().

I’m excited to try this out when I get to work! But let me make sure I understand what it is you are suggesting…

Are you saying that in the field path of the second prefetch, I can refer to the attribute created in the first prefetch, even when both Prefetch objects are arguments to the same prefetch_related call?

I guess this is all a little bit moot, however. Regardless of whether or not I create these attributes, since what I want is a unique list of studies associated with the archive files, I would have to iterate through all peak groups and studies to build a unique list of studies associated with each archive file. My intention with creating the attribute was so I wouldn’t have to construct that unique list via iteration, but I see that’s unavoidable.

I think I had a possibly invalid assumption that got me here… I had initially tried building the unique list by iterating in the template. When I moved that logic into the view, I observed a significant performance hit, though it may not have been an apples to apples comparison, because later I couldn’t reproduce the performance difference.

I just installed the Django debug toolbar yesterday and am rewriting the prototype, in a better way, so I’ll eventually figure out if my prefetches are working as I intend.

Yep.

what I want is a unique list of studies associated with the archive files

Sounds like Study.objects.filter(animal__sample__...archive_file__isnull=False)?

:exploding_head:

Well, not collectively: I mean per archive file row in the results. Each row will have a delimited list of linked study names. But maybe that’s what you mean? So something like this?

ArchiveFile.objects.prefetch_related(Prefetch("peak_groups__msrun_sample__sample__animal__studies", queryset=Study.objects.filter(animal__sample__...archive_file__isnull=False), to_attr="studs")).distinct()

That doesn’t seem right. I must be extrapolating your meaning too far… I’ll still have to iterate over all peak groups and studies for each archive file row to build the row’s unique set, unless I do that query for each row and provide ...archive_file__exact=obj.pk.

So let me ask this… is there a (significant) speed difference between:

  1. ArchiveFile.objects.prefetch_related("peak_groups__msrun_sample__sample__animal__studies").distinct() followed by iterating to build the distinct set.
  2. ArchiveFile.objects.prefetch_related(Prefetch("peak_groups__msrun_sample__sample__animal__studies", queryset=Study.objects.filter(animal__sample__...archive_file__isnull=False), to_attr="studs")).distinct() followed by iterating to build the distinct set.

They prefetch the same things, don’t they? If I had a filter on ArchiveFile, then maybe the Study prefetch queryset would make a difference?

It’s the iterating over the peak groups that takes all the time, because usually there are many peak groups that link to each archive file, and since the attribute is attached to Animal, I can’t get around that.

I was able to effect a significant speed-up by adding a study count annotation (e.g. .annotate(study_count=Count("peak_groups__msrun_sample__sample__animal__studies", distinct=True))) and then break my iterating on each row/result once my unique list of studies reached that number.
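A minimal plain-Python sketch of that early exit, under some assumptions: each peak group is stood in for by a dict with a hypothetical "studies" list of names (in the real code these would be model instances reached through the prefetched relations), and study_count is the value from the annotation. The function name unique_study_names is made up for illustration.

```python
def unique_study_names(peak_groups, study_count):
    """Collect distinct study names, stopping once study_count are found.

    Returns the names in first-seen order and how many peak groups
    were visited before stopping.
    """
    names = {}  # a dict preserves insertion order and drops duplicates
    visited = 0
    for pg in peak_groups:
        visited += 1
        for name in pg["studies"]:
            names[name] = None
        # Every distinct study has been seen, so the remaining peak
        # groups cannot add anything new.
        if len(names) >= study_count:
            break
    return list(names), visited


peak_groups = [
    {"name": "pg1", "studies": ["s1"]},
    {"name": "pg2", "studies": ["s1", "s2"]},
    {"name": "pg3", "studies": ["s2"]},
    {"name": "pg4", "studies": ["s1"]},
]

names, visited = unique_study_names(peak_groups, study_count=2)
print(names, visited)
```

In this toy data the loop stops after the second peak group, because both distinct studies have been found by then.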

I’m just not sure whether there’s something I can leverage that I’m not leveraging yet. I don’t supply the prefetch queryset argument in this implementation yet. I have thought of trying it. I did it for my advanced search page by basically rerooting the field paths of all the search terms, but that didn’t occur to me here since I don’t always have a search term.

So it comes down to the difference in those 2 prefetch_related arguments above…

We’re straying from the original question about finding the result of to_attr. Can you open a new thread with your question about querying? When you do, be sure to mention whether a values() query would be sufficient or whether you want model instances.

EDIT: that thread already existed