Release 4.2a: Async Querysets with prefetch_related

I’m exploring implementing top-to-bottom async views in our application. (Strictly speaking, I’m working on async logic to generate responses to websocket messages arriving via channels.) I just ran into the queryset limitation that the asynchronous iterator, QuerySet.aiterator(), doesn’t support prefetch_related().

After some thought and code reading, I overrode QuerySet.aiterator() with the logic below, inspired by the synchronous iterator() method. It seems to work reasonably well: data is prefetched one chunk at a time. Beyond the performance cost of crossing the sync/async boundary for every chunk of results, are there any other side effects that I should worry about?

The solution seemed so easy that I was surprised it wasn’t in the 4.2 alpha build, so much so that I’m wondering what I must be overlooking.

(TLDR: The logic sets aside the list of prefetches, then uses the standard async iterator to assemble a chunk of results. The prefetches are then applied to that chunk, much like the standard iterator() method, and then the results are yielded one at a time.)

async def aiterator(self, chunk_size=2000):
    """
    An asynchronous iterator over the results from applying this QuerySet
    to the database. Prefetches are applied one chunk at a time.
    """
    from collections import deque

    from asgiref.sync import sync_to_async
    from django.db.models import prefetch_related_objects

    if chunk_size <= 0:
        raise ValueError("Chunk size must be strictly positive.")

    if not self._prefetch_related_lookups:
        # Nothing to prefetch: defer entirely to the stock implementation.
        async for result in super().aiterator(chunk_size=chunk_size):
            yield result
        return  # without this, the queryset would be iterated a second time

    # Set the lookups aside so the stock aiterator() accepts the queryset.
    prefetch_related_lookups = self._prefetch_related_lookups
    self._prefetch_related_lookups = ()
    prefetch = sync_to_async(prefetch_related_objects)

    chunk = deque()
    try:
        async for result in super().aiterator(chunk_size=chunk_size):
            chunk.append(result)
            if len(chunk) == chunk_size:
                # prefetch_related_objects() is synchronous, so each chunk
                # crosses the sync boundary once.
                await prefetch(chunk, *prefetch_related_lookups)
                for item in chunk:
                    yield item
                chunk.clear()

        # Flush the final, possibly partial, chunk.
        if chunk:
            await prefetch(chunk, *prefetch_related_lookups)
            for item in chunk:
                yield item
    finally:
        # Restore the lookups even if the consumer stops iterating early.
        self._prefetch_related_lookups = prefetch_related_lookups
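
For anyone who wants to poke at the chunk-then-prefetch pattern without a Django project, here is a minimal stdlib-only sketch of the same idea. Everything in it is a hypothetical stand-in: fake_rows plays the role of the plain async iterator, attach_related plays the role of prefetch_related_objects, and asyncio.to_thread is used in place of sync_to_async. It is not the Django implementation, only an illustration of the buffering logic.

```python
import asyncio
from collections import deque


async def fake_rows(n):
    """Hypothetical stand-in for the plain async iterator: yields n dict rows."""
    for i in range(n):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield {"id": i}


def attach_related(rows):
    """Hypothetical synchronous 'prefetch': annotates a whole chunk in one call."""
    for row in rows:
        row["related"] = f"related-{row['id']}"


async def chunked_with_prefetch(source, chunk_size, prefetch):
    """Buffer chunk_size rows, run the sync prefetch once per chunk, then yield."""
    if chunk_size <= 0:
        raise ValueError("Chunk size must be strictly positive.")
    chunk = deque()
    async for row in source:
        chunk.append(row)
        if len(chunk) == chunk_size:
            # One trip to a worker thread per full chunk.
            await asyncio.to_thread(prefetch, chunk)
            while chunk:
                yield chunk.popleft()
    if chunk:
        # Flush the final, possibly partial, chunk.
        await asyncio.to_thread(prefetch, chunk)
        while chunk:
            yield chunk.popleft()


async def collect(n, chunk_size):
    """Drain the iterator and record how many rows each prefetch call saw."""
    calls = []

    def counting_prefetch(rows):
        calls.append(len(rows))
        attach_related(rows)

    rows = [
        row
        async for row in chunked_with_prefetch(fake_rows(n), chunk_size, counting_prefetch)
    ]
    return rows, calls
```

Running asyncio.run(collect(5, 2)) drains five rows in chunks of two, so the stand-in prefetch runs once per chunk rather than once per row, which is the trade-off described above.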

:wave: hello there!

All Django development takes place in the open, so a nice way to understand the rationale is to walk your way from the exception message you are encountering all the way to the GitHub PR that introduced it and its associated ticket, using git blame or the GitHub blame interface.

Searching for “prefetch” on the GitHub PR leads to this conversation.

TL;DR: QuerySet.aiterator was added around the same time that prefetch_related support was added to QuerySet.iterator. To get async queryset support merged in time for 4.1, it was decided not to implement prefetching support at that point, and it was forgotten from there :slight_smile:

I suggest you create a new feature ticket to add support for it, and even submit a patch if you’re interested in doing so. At this point it will likely not make the cut for 4.2, but it should hopefully make its way into 5.0.

Thank you, Simon!

I confess, I haven’t quite made the bridge from being a fairly good Django user to being a Django developer, and had not tried to chase the rationale through the various logs and tickets. I’ll try to do better next time.

I’ll enter a feature request and, if I can get the environment set up today, try to submit a patch. It would be my first submission.

baj

May I ask whether this is still your solution and, if so, where you put it? I have the same issue at the moment.