Prefetch top N Most Recent Related Objects

Hi friends, I understand I can use subqueries and annotations to select the most recent related object’s attribution, but since subqueries cannot return objects, only single columns, I’m trying to think through how to “select the N most recent related objects” for every record in a QuerySet leveraging the ORM.

For example, consider Many Documents to One Ticket model. We can get the latest document id for each ticket easily

latest_documents = Docs.objects.filter(id_main=OuterRef("id")).order_by(
    "-tstamp"
)
Ticket.objects.annotate(
    latest_document_id=Subquery(latest_documents.values("document_id")[:1])
)

But what if I want the entire related object? I understand I can use Prefetch to get all the related Ticket.docs, but after many hours of Googling and experimentation I don’t understand how to limit the prefetch operation to the top 1 or 2 most recent documents for each Ticket.

Consider a Ticket List view and I want to show the top two latest document names and types for every ticket in the same grid. The best I could come up with is a creating a Prefetch(queryset=<qs>) where qs is a Window function filtered rank, something like

    Docs.objects.annotate(
        row_number=Window(
            expression=RowNumber(),
            partition_by=[F("ticket_id")],
            order_by="-tstamp")
        ).filter(
        row_number__lte=2
    )

But that doesn’t feel great either since most databases aren’t great at Window function filtering optimizations (Snowflake’s Qualify By rules).

Anyway, filtering the top N related objects seems like a common task, so I’m clearly missing something! Thanks for any help.

Cheers.

1 Like

Assuming you’re on Django 4.2+ you should be able to simply use Prefetch with a sliced queryset and it will use the proper window functions for you.

Ticket.objects.prefetch_related(
    Prefetch(
        "documents",
        Document.objects.order_by("-timestamp")[:n]
        to_attr="latest_documents",
    )
)
2 Likes

Cheers thanks Charettes. I’ll try it out. I haven’t used Django in half a decade, but I remember you helping me out long ago. Hope you and yours are well!

1 Like

I’ve got TypeError: Cannot filter a query once a slice has been taken.

It’s impossible to provide you any support @OttoAndrey as you didn’t provide a full traceback, a piece of code and associated set of models to reproduce.

As mentioned above the features is only available in Django 4.2+ and is tested against many-to-many fields in both directions as well as reverse foreign keys which are the only multi-valued relationships Django exposes where this feature makes sense.

1 Like

You are right. I’m using Django 5.2.

For example I have three models: User, Post, Like.
Post has fk to User. Post has GenericRelation to Like.

If I do

User.objects.prefetch_related(Prefetch("posts",  queryset=Post.objects.order_by("-pk")[:1], to_attr="latest_posts"))

It is working as expected. Each user will be with one or zero posts.

If I do the same for Post-Like

Post.objects.prefetch_related(Prefetch("likes", queryset=Like.objects.order_by("-pk")[:1], to_attr="last_likes"))
Models
from django.conf import settings
from django.contrib.auth.models import AbstractBaseUser, PermissionsMixin
from django.contrib.auth.models import UserManager as DjangoUserManager
from django.contrib.contenttypes.fields import GenericRelation, \
    GenericForeignKey
from django.db import models
from django.utils.translation import gettext_lazy as _

import citext


class UserManager(DjangoUserManager):
    """Adjusted user manager that works w/o `username` field."""

    def _create_user(
        self,
        email: str,
        password: str | None,
        **extra_fields,
    ) -> "User":  # pragma: no cover
        """Create and save a user with the given email and password."""
        if not email:
            raise ValueError("The given email must be set")
        email = self.normalize_email(email)
        user = self.model(email=email, **extra_fields)
        user.set_password(password)
        user.save(using=self._db)
        return user

    def create_superuser(
        self,
        email: str,
        password: str | None = None,
        **extra_fields,
    ) -> "User":  # pragma: no cover
        """Create superuser instance (used by `createsuperuser` cmd)."""
        extra_fields.setdefault("is_staff", True)
        extra_fields.setdefault("is_superuser", True)

        if extra_fields.get("is_staff") is not True:
            raise ValueError("Superuser must have is_staff=True.")
        if extra_fields.get("is_superuser") is not True:
            raise ValueError("Superuser must have is_superuser=True.")

        return self._create_user(email, password, **extra_fields)


class User(
    models.Model,
    AbstractBaseUser,
    PermissionsMixin,
):
    """Custom user model without username."""

    first_name = models.CharField(
        verbose_name=_("First name"),
        max_length=30,
        blank=True,
    )
    last_name = models.CharField(
        verbose_name=_("Last name"),
        max_length=30,
        blank=True,
    )
    email = citext.CIEmailField(
        verbose_name=_("Email address"),
        max_length=254,  # to be compliant with RFCs 3696 and 5321
        unique=True,
    )
    is_staff = models.BooleanField(
        verbose_name=_("Staff status"),
        default=False,
        help_text=_(
            "Designates whether the user can log into this admin site.",
        ),
    )
    is_active = models.BooleanField(
        verbose_name=_("Active"),
        default=True,
        help_text=_(
            "Designates whether this user should be treated as active.",
        ),
    )

    avatar = models.ImageField(
        verbose_name=_("Avatar"),
        blank=True,
        null=True,
        upload_to=settings.DEFAULT_MEDIA_PATH,
        max_length=512,
    )

    EMAIL_FIELD = "email"
    USERNAME_FIELD = "email"
    REQUIRED_FIELDS = []

    objects = UserManager()

    class Meta:
        verbose_name = _("User")
        verbose_name_plural = _("Users")

    def __str__(self) -> str:
        return self.email


class Post(models.Model):

    user = models.ForeignKey(
        to="users.User",
        on_delete=models.CASCADE,
        related_name="posts",
    )

    likes = GenericRelation(
        to="users.Like",
        related_query_name="post",
    )

    body = models.TextField()


class Like(models.Model):

    user = models.ForeignKey(
        to="users.User",
        on_delete=models.CASCADE,
        related_name="likes",
    )

    content_type = models.ForeignKey(
        to="contenttypes.ContentType",
        verbose_name=_("Content type of object"),
        on_delete=models.CASCADE,
        limit_choices_to={
            "model__in": [
                Post._meta.model_name,
            ],
        },
    )
    object_id = models.PositiveBigIntegerField(
        verbose_name=_("Related object ID"),
    )
    content_object = GenericForeignKey(
        ct_field="content_type",
        fk_field="object_id",
    )
Traceback
In [16]: Post.objects.prefetch_related(Prefetch("likes", queryset=Like.objects.order_by("-pk")[:1], to_attr="last_likes"))

Out[16]: SELECT "users_post"."id",
       "users_post"."created",
       "users_post"."modified",
       "users_post"."user_id",
       "users_post"."body"
  FROM "users_post"
 LIMIT 21

Execution time: 0.004464s [Database: default]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File ~/work_projects/saritasa-python-boilerplate-django/.venv/lib/python3.13/site-packages/IPython/core/formatters.py:770, in PlainTextFormatter.__call__(self, obj)
    763 stream = StringIO()
    764 printer = pretty.RepresentationPrinter(stream, self.verbose,
    765     self.max_width, self.newline,
    766     max_seq_length=self.max_seq_length,
    767     singleton_pprinters=self.singleton_printers,
    768     type_pprinters=self.type_printers,
    769     deferred_pprinters=self.deferred_printers)
--> 770 printer.pretty(obj)
    771 printer.flush()
    772 return stream.getvalue()

File ~/work_projects/saritasa-python-boilerplate-django/.venv/lib/python3.13/site-packages/IPython/lib/pretty.py:419, in RepresentationPrinter.pretty(self, obj)
    408                         return meth(obj, self, cycle)
    409                 if (
    410                     cls is not object
    411                     # check if cls defines __repr__
   (...)
    417                     and callable(_safe_getattr(cls, "__repr__", None))
    418                 ):
--> 419                     return _repr_pprint(obj, self, cycle)
    421     return _default_pprint(obj, self, cycle)
    422 finally:

File ~/work_projects/saritasa-python-boilerplate-django/.venv/lib/python3.13/site-packages/IPython/lib/pretty.py:794, in _repr_pprint(obj, p, cycle)
    792 """A pprint that just redirects to the normal repr function."""
    793 # Find newlines and replace them with p.break_()
--> 794 output = repr(obj)
    795 lines = output.splitlines()
    796 with p.group():

File ~/work_projects/saritasa-python-boilerplate-django/.venv/lib/python3.13/site-packages/django/db/models/query.py:376, in QuerySet.__repr__(self)
    375 def __repr__(self):
--> 376     data = list(self[: REPR_OUTPUT_SIZE + 1])
    377     if len(data) > REPR_OUTPUT_SIZE:
    378         data[-1] = "...(remaining elements truncated)..."

File ~/work_projects/saritasa-python-boilerplate-django/.venv/lib/python3.13/site-packages/django/db/models/query.py:400, in QuerySet.__iter__(self)
    385 def __iter__(self):
    386     """
    387     The queryset iterator protocol uses three nested iterators in the
    388     default case:
   (...)
    398            - Responsible for turning the rows into model objects.
    399     """
--> 400     self._fetch_all()
    401     return iter(self._result_cache)

File ~/work_projects/saritasa-python-boilerplate-django/.venv/lib/python3.13/site-packages/django/db/models/query.py:1930, in QuerySet._fetch_all(self)
   1928     self._result_cache = list(self._iterable_class(self))
   1929 if self._prefetch_related_lookups and not self._prefetch_done:
-> 1930     self._prefetch_related_objects()

File ~/work_projects/saritasa-python-boilerplate-django/.venv/lib/python3.13/site-packages/django/db/models/query.py:1320, in QuerySet._prefetch_related_objects(self)
   1318 def _prefetch_related_objects(self):
   1319     # This method can only be called once the result cache has been filled.
-> 1320     prefetch_related_objects(self._result_cache, *self._prefetch_related_lookups)
   1321     self._prefetch_done = True

File ~/work_projects/saritasa-python-boilerplate-django/.venv/lib/python3.13/site-packages/django/db/models/query.py:2380, in prefetch_related_objects(model_instances, *related_lookups)
   2377     obj_to_fetch = [obj for obj in obj_list if not is_fetched(obj)]
   2379 if obj_to_fetch:
-> 2380     obj_list, additional_lookups = prefetch_one_level(
   2381         obj_to_fetch,
   2382         prefetcher,
   2383         lookup,
   2384         level,
   2385     )
   2386     # We need to ensure we don't keep adding lookups from the
   2387     # same relationships to stop infinite recursion. So, if we
   2388     # are already on an automatically added lookup, don't add
   2389     # the new lookups from relationships we've seen already.
   2390     if not (
   2391         prefetch_to in done_queries
   2392         and lookup in auto_lookups
   2393         and descriptor in followed_descriptors
   2394     ):

File ~/work_projects/saritasa-python-boilerplate-django/.venv/lib/python3.13/site-packages/django/db/models/query.py:2544, in prefetch_one_level(instances, prefetcher, lookup, level)
   2523 # prefetcher must have a method get_prefetch_querysets() which takes a list
   2524 # of instances, and returns a tuple:
   2525 
   (...)
   2533 # The 'values to be matched' must be hashable as they will be used
   2534 # in a dictionary.
   2536 if hasattr(prefetcher, "get_prefetch_querysets"):
   2537     (
   2538         rel_qs,
   2539         rel_obj_attr,
   2540         instance_attr,
   2541         single,
   2542         cache_name,
   2543         is_descriptor,
-> 2544     ) = prefetcher.get_prefetch_querysets(
   2545         instances, lookup.get_current_querysets(level)
   2546     )
   2547 else:
   2548     warnings.warn(
   2549         "The usage of get_prefetch_queryset() in prefetch_related_objects() is "
   2550         "deprecated. Implement get_prefetch_querysets() instead.",
   2551         RemovedInDjango60Warning,
   2552         stacklevel=2,
   2553     )

File ~/work_projects/saritasa-python-boilerplate-django/.venv/lib/python3.13/site-packages/django/contrib/contenttypes/fields.py:668, in create_generic_related_manager.<locals>.GenericRelatedObjectManager.get_prefetch_querysets(self, instances, querysets)
    665 object_id_converter = instances[0]._meta.pk.to_python
    666 content_type_id_field_name = "%s_id" % self.content_type_field_name
    667 return (
--> 668     queryset.filter(query),
    669     lambda relobj: (
    670         object_id_converter(getattr(relobj, self.object_id_field_name)),
    671         getattr(relobj, content_type_id_field_name),
    672     ),
    673     lambda obj: (obj.pk, self.get_content_type(obj).pk),
    674     False,
    675     self.prefetch_cache_name,
    676     False,
    677 )

File ~/work_projects/saritasa-python-boilerplate-django/.venv/lib/python3.13/site-packages/django/db/models/query.py:1476, in QuerySet.filter(self, *args, **kwargs)
   1471 """
   1472 Return a new QuerySet instance with the args ANDed to the existing
   1473 set.
   1474 """
   1475 self._not_support_combined_queries("filter")
-> 1476 return self._filter_or_exclude(False, args, kwargs)

File ~/work_projects/saritasa-python-boilerplate-django/.venv/lib/python3.13/site-packages/django/db/models/query.py:1488, in QuerySet._filter_or_exclude(self, negate, args, kwargs)
   1486 def _filter_or_exclude(self, negate, args, kwargs):
   1487     if (args or kwargs) and self.query.is_sliced:
-> 1488         raise TypeError("Cannot filter a query once a slice has been taken.")
   1489     clone = self._chain()
   1490     if self._defer_next_filter:

TypeError: Cannot filter a query once a slice has been taken.

Maybe slice doesnt work with generic relation?

Thanks for providing more details, greatly appreciated.

Maybe slice doesn’t work with generic relation?

It’s effectively not implemented but it’s technically feasibly to if you’d like to request the feature to be added.

1 Like

That would be great!