first() Method Does Not Use prefetch_related

I’ve discovered that when using Django’s first() method on a queryset that has prefetch_related applied, the prefetched relationships are not utilized in the returned object. This leads to unexpected behavior and potential performance issues.

# Example model setup
from django.db import models

class Author(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    title = models.CharField(max_length=200)
    author = models.ForeignKey(Author, on_delete=models.CASCADE, related_name='books')

# Prefetch the books for all authors
authors = Author.objects.all().prefetch_related('books')

# This uses the prefetched data - no additional query
books = authors[0].books.all()  # Uses prefetched data

# But using first() doesn't utilize the prefetch_related
first_book = authors[0].books.first()  # Triggers a new database query instead of using prefetched data

Expected Behavior

I would expect that when using first() on a reverse ForeignKey relationship manager from an object in a queryset with prefetch_related applied, it would use the already prefetched data, avoiding additional database queries.

Actual Behavior

When using first() on the reverse ForeignKey relationship manager, the prefetched relationships are not utilized, and it triggers a new database query to fetch the first related object.

Database Information

  • Django Version: 5.0.3
  • Database Engine: PostgreSQL 15
  • Python Version: 3.11

Additional Information

I’ve verified this behavior by examining the SQL queries logged during execution. The all() method properly utilizes the prefetched data, but first() issues a new query that should be unnecessary.

This behavior is inconsistent with how reverse ForeignKey relationship manager methods like all() work when the related objects have been prefetched.

Proposed Solution

The first() method on reverse ForeignKey relationship managers should respect and utilize data that has been prefetched via prefetch_related, ensuring that no additional database queries are needed when the data is already available.

The issue here is that the books were never ordered. first() has to be deterministic, so it adds an implicit order_by('pk') when the queryset has no ordering. If your books prefetch was not ordered, that extra ordering means a trip to the database is made.

This is explained at more length here.

Two ways to order the related instances:

What a find, 10 years later. Just learned something new. Initially I thought that .first would always issue a query, no matter what.

Can we do the ordering in Python if the queryset is already prefetched and the prefetched objects are not ordered? I am pretty sure that getting the object with the lowest pk from the prefetched objects would be faster than calling the database.

Can we do the ordering in Python if the queryset is already prefetched and the prefetched objects are not ordered?

It might be tempting but it would return the wrong results in many cases.

Even if we set aside the fact that first() will use the existing queryset ordering, which can be composed of arbitrarily complex database expressions, and focus only on the primary key fallback, doing the ordering entirely on the Python side is error prone and would cause subtle breakages for non-integer primary keys.

For example, a text-based primary key could make use of a collation that doesn’t follow the same ordering rules as what sorted(objs, key=lambda obj: obj.pk) would produce.
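To make the collation point concrete, here is a small pure-Python sketch. Python compares strings by code point, so uppercase letters sort before lowercase ones, whereas many database collations ignore case. The use of str.casefold below is only a stand-in for such a collation, and the Row class and sample keys are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Row:
    # Hypothetical stand-in for a model instance with a text primary key.
    pk: str


objs = [Row("apple"), Row("Banana"), Row("cherry")]

# What a Python-side fallback would pick: code point order, uppercase first.
python_first = sorted(objs, key=lambda obj: obj.pk)[0]

# Rough approximation of a case-insensitive database collation (an assumption).
db_like_first = sorted(objs, key=lambda obj: obj.pk.casefold())[0]

print(python_first.pk, db_like_first.pk)
```

The two orderings disagree on which row comes first, which is exactly the kind of mismatch that would make a Python-side first() return a different object than the database would.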
