Tasks framework versatility & performance

Hi there :waving_hand:,

I hopped on the bandwagon pretty quickly after the django.tasks release, as a common interface would make my life as a 3rd-party maintainer simpler. I found out about it pretty late, but I was so grateful that someone (Jake) mustered the energy for such a massive proposal.

However, even though I am working on multiple commercial Django 6.0 projects, not a single one has adopted tasks in half a year. Why?

So, I did what any sane person does and tried building the tools I was missing. First was django-crontask. I was already maintaining its Dramatiq sister project for years—easy transformation, right?

No, that’s when I discovered Django has introduced dataclasses. I love dataclasses, but since the decorator generates methods such as __init__ from the class’s fields, inheritance is tricky. For a task framework meant to be extended by the community, this felt like an odd choice. The feeling was amplified by the fact that the dataclasses are frozen. I have yet to uncover why. Especially since they are not truly immutable like a namedtuple, and freezing comes with a known performance penalty; see the dataclasses documentation (Python 3.14.4).
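To make the inheritance point concrete, here is a minimal sketch using plain dataclasses. BaseTask and MutableTask are hypothetical stand-ins, not Django's actual classes; the sketch only shows how frozen dataclasses behave in general.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class BaseTask:  # hypothetical stand-in for a frozen task class
    name: str
    priority: int = 0


# A dataclass subclass has to repeat the frozen flag;
# mixing frozen and non-frozen is rejected at class creation.
try:
    @dataclass  # not frozen
    class MutableTask(BaseTask):
        queue: str = "default"
except TypeError as exc:
    inherit_error = str(exc)
else:
    inherit_error = None

task = BaseTask(name="send_mail")

# Normal assignment is blocked...
try:
    task.priority = 10
except FrozenInstanceError:
    blocked = True
else:
    blocked = False

# ...but the freeze is only emulated and can be bypassed.
object.__setattr__(task, "priority", 10)
```

After this runs, task.priority is 10 even though the class is declared frozen, which is the difference to a real tuple.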

Moving on to TaskResult, it’s even stranger. It’s a frozen (emulated immutable) dataclass, but we treat it as mutable. When you call TaskResult.refresh, it updates the object in place (using object.__setattr__) instead of returning a new immutable object. There was a review comment about this, but to my knowledge it was sadly never addressed. It’s especially odd to me, since Python has a native dataclasses.replace function, which correctly returns new dataclass instances and would have greatly simplified the Django implementation.
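The two refresh styles can be sketched side by side. Result, refresh_in_place and refreshed are hypothetical names for illustration, and the status strings are placeholders; this is not Django's implementation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Result:  # hypothetical stand-in, not Django's TaskResult
    id: str
    status: str = "READY"

    def refresh_in_place(self, status: str) -> None:
        # Bypasses the frozen guard, so every holder of this object sees the change.
        object.__setattr__(self, "status", status)

    def refreshed(self, status: str) -> "Result":
        # Leaves self untouched and returns a new instance.
        return replace(self, status=status)


original = Result(id="abc")
updated = original.refreshed("SUCCESSFUL")
# original.status is still "READY"; updated.status is "SUCCESSFUL"

original.refresh_in_place("FAILED")
# original.status is now "FAILED", although the class is declared frozen
```

With the replace-based variant, callers keep the old snapshot and decide for themselves when to swap it for the new one.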

Now, let’s chat about efficiency for a minute. For Django’s tasks framework to be a good base, it must be versatile and enable performance. It should be able to handle a few tasks a day just as well as a couple of million per second. Luckily, Django 6.1 will have picklable tasks to support multiprocessing, nice!

Still, I’d like to propose a few more changes to improve both versatility and performance:

  1. Unfreeze Task: Dataclasses are a good choice here. Tasks are instantiated during module loading as quasi singletons. People can better use inheritance or even do in-place updates with decorators. Attribute read performance could be improved by slotting them.
  2. Use typing.NamedTuple for TaskResult and TaskContext: You may hold millions of those objects in memory if you trigger map-reduce tasks (or other bulk operations). Dataclasses are objects (with a __dict__ or __slots__), whereas named tuples are C structs. They use less memory both at runtime and in transport while piping them to different processes. They are also actually immutable, so no more in-place updates.
  3. Lazily reference TaskResult.task (via import_string and a property): Currently every task result holds a reference to a task instance. This can lead to task instances (quasi singletons) being copied. This can create unwanted memory bandwidth and allocation overhead.
  4. Make task results comparable: Tasks have a priority (wonderful!), but don’t implement a comparison method they would need for Python’s native PriorityQueue. I would suggest a default priority LIFO order.

This is a difficult topic and a difficult read. I am genuinely impressed by the work that has been done. But being so complex, I believe it will take multiple iterations to reach a robust framework that best serves the community.

Best!
Joe


Amendment

  5. Drop call / acall: Currently both use asgiref (even if they run outside of an ASGI lifecycle) and are thread-aware (1 thread per execution).

I think people are not aware of the performance consequences of thread-aware sync_to_async calls. I believe developers should make an active and informed decision about how to execute a task. IMHO, those functions offer little to no benefit over calling Task.func directly, which also gives you more control.
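What "an active decision" could look like in a worker is sketched below, using only the standard library. The task functions and the run_sync / run_async helpers are hypothetical, and asyncio.to_thread is a stand-in with different thread semantics from asgiref's sync_to_async; the point is only that the worker chooses the execution path explicitly.

```python
import asyncio
import inspect


def add(a, b):  # hypothetical sync task function
    return a + b


async def aadd(a, b):  # hypothetical async task function
    await asyncio.sleep(0)
    return a + b


def run_sync(func, *args, **kwargs):
    """Run a task function from synchronous worker code."""
    if inspect.iscoroutinefunction(func):
        # The worker explicitly decides to spin up an event loop here.
        return asyncio.run(func(*args, **kwargs))
    # Plain call: no event loop, no thread hop.
    return func(*args, **kwargs)


async def run_async(func, *args, **kwargs):
    """Run a task function from asynchronous worker code."""
    if inspect.iscoroutinefunction(func):
        return await func(*args, **kwargs)
    # The worker explicitly chooses a thread pool for blocking code.
    return await asyncio.to_thread(func, *args, **kwargs)
```

A purely synchronous worker would only ever take the plain-call branch of run_sync and pay no async overhead at all.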

A lot to unpack here, so here goes:

At present, unless django-tasks-db or django-tasks-rq work perfectly for your use case, the onus of adopting django.tasks is more on library authors than on project maintainers. The ecosystem needs some solid foundations first before people can start confidently building upon it. For some projects that’s django-tasks-db and django-tasks-rq, and for others it’s not - and that’s ok. I’m not surprised it’s taking time for projects to migrate to or build around django.tasks, but it’s definitely happening. If there are things the framework could do better (perhaps outside the below), please say - I’d love to have some conversations in the RealOrangeOne/django-tasks Discussions on GitHub!

It’s not super complex, but I agree it’s not always obvious. django-tasks-db has a custom TaskResult which extends the base and it works just fine - all it needs is a dataclass decorator. I went with a dataclass since the problem at hand fit into their domain nicely, and it brings with it a few shorthand niceties. I’m not at all tied to it, and if there’s good reason to convert to a conventional class - let’s! (deprecation details intentionally not discussed).

The main reason is to avoid the foot-gun of people mutating instances unnecessarily and expecting them to be committed. Since there’s currently no API to update results in place, it felt useful. However, it’s a problem people are aware of with ORM models, so again I’m not opposed to un-freezing. I’d suggest unfreezing Task and TaskResult together.

This isn’t quite true. Internally it’s mutated, namely so refresh functions as expected, and because in some cases it’s cleaner than passing everything into the constructor (sure, these might be solved by dropping dataclass). However externally, it’s intended to be considered read-only (subtly different to immutable). Pure immutability wasn’t really a design goal.

TaskResult.refresh was intended to mirror Model.refresh_from_db. You can still retrieve a clean updated instance with get_result, much like you can with the ORM. In practice, I doubt refresh will be called very often, since it assumes the TaskResult lives longer than the task takes to run. Since there’s already get_result, I don’t know that a method which returns a new instance on the TaskResult itself is useful - but I’m happy to be proven wrong.


Now for the changes:

  1. Unfreeze Task: I’m interested in discussing this further to gauge input. The django/new-features issue tracker on GitHub is probably the place to go for this. Again though, I’d suggest considering TaskResult in the same conversation, since the same merits will hold true.
  2. Use typing.NamedTuple for TaskResult and TaskContext: I don’t think a NamedTuple is right here. It leads to odd APIs which can be surprising to many. attrs has some great reading on this. With that said, I’d be interested in continuing discussions on replacing dataclasses with say a native class, especially if the cons outweigh the pros. new-features repo sounds like a good place for this discussion.
  3. Lazily reference TaskResult.task (via import_string and a property): This sounds like a great idea to me. Having to import and instantiate the Task for each result is likely fairly expensive, duplicates instances, and is just generally unnecessary work in many cases. It’s probably not quite ticketable yet, so new-features is probably the place to go to gain some wider input on impact.
  4. Make task results comparable: This one is interesting to me - what value do you see in them being comparable? It’s more than just priority - in ORM speak it should be [F("priority").desc(), F("run_after").asc(), F("enqueued_at").asc()]. I’m not opposed to implementing that, but I’m not sure I see the value (at least wide enough to be implemented in core).
  5. Drop call / acall: These are intentionally not part of the public API, and exist as a shorthand to trigger the functions easily, without each callsite needing to consider whether the task is async or not. It’s short-hand, not the intended operation. If a worker needs to grab .func directly, they absolutely should. I know sync_to_async can cause problems, but so long as concurrency within a single thread isn’t too high (which the GIL sort of blocks anyway), the overhead should be minimal - especially since if you know everything is async, you can avoid acall entirely.
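The lazy reference from item 3 could look roughly like the sketch below. It uses importlib as a stand-in for Django's import_string, and Result, task_path and resolve are hypothetical names; how Django would actually store the path is an open question.

```python
import importlib
from dataclasses import dataclass
from functools import lru_cache


@lru_cache(maxsize=None)
def resolve(dotted_path: str):
    """Import 'package.module.attr' once and cache the resulting object."""
    module_path, _, attr = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_path), attr)


@dataclass(frozen=True)
class Result:  # hypothetical stand-in, not Django's TaskResult
    id: str
    task_path: str  # only the dotted path is stored and serialized

    @property
    def task(self):
        # Resolved on first access; all results share the same object.
        return resolve(self.task_path)


# json.dumps stands in for a module-level task object.
first = Result(id="1", task_path="json.dumps")
second = Result(id="2", task_path="json.dumps")
```

Results that never touch .task never trigger the import, and those that do all get the same cached object instead of a copy.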

Comments like these are exactly what this project needs. I don’t want to be the only one designing django.tasks - it needs other people’s thoughts and experiences to be a useful and “robust” framework. Thanks for taking the time to formulate and write it!


Hi @theorangeone,

Thanks for taking the time to write such a detailed response and for the little deep dive into the history.

I hope you can tell that I am a big fan of the work you did and tasks in Django. I think many would agree that Django hasn’t seen such a big new feature since ASGI. :1st_place_medal:

Community adoption

100%, it’s going to take a while, especially since there are very mature options like Celery.
That being said, I secretly dream of a day when we have a Redis queue in Django. :slight_smile:

Dataclasses

Again, I do like dataclasses, but since the tasks implementation is swappable, the inheritance quirks are real. My main concern is that you can’t override fields, e.g., with properties, but it’s a small concern.
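A minimal sketch of the property problem, with hypothetical class names that are not part of Django:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseResult:  # hypothetical stand-in for a swappable base class
    id: str
    status: str = "READY"


class ComputedResult(BaseResult):
    @property
    def status(self):
        # Intended to compute the status lazily instead of storing it.
        return "RUNNING"


try:
    ComputedResult(id="abc")
except AttributeError as exc:
    # The generated __init__ still assigns to "status",
    # which now hits a property without a setter.
    override_error = exc
else:
    override_error = None
```

Instantiation fails before the property is ever read, so the subclass would need its own __init__ or a setter to work around it.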

This isn’t quite true. Internally it’s mutated, namely so refresh functions as expected, and because in some cases it’s cleaner than passing everything into the constructor (sure, these might be solved by dropping dataclass). However externally, it’s intended to be considered read-only (subtly different to immutable). Pure immutability wasn’t really a design goal.

I see… hm… I don’t think “internally mutable but externally immutable” communicates clearly to users, especially since refresh updates in place, which makes the object mutable from the user’s point of view too and forces them to manage state.

Maybe it’s helpful if we flip a coin and decide on either paradigm.

About the changes…

  1. Cool, I can open an issue there. I thought more about cleanup on Trac, but I am not in a rush.
  2. typing.NamedTuple: The attrs distinction didn’t age well. They are typed now. I also disagree with other points, since they focus on usage, not Python’s internal implementation. A tuple is not an object, which makes a big difference for performance. And honestly, this is my only angle. I want to have a smaller memory footprint and quicker serialization. Both are strong suits of a tuple. Yes, they are actually immutable, but as you mentioned before, this might be useful.
  3. Same as 1., but no rush.
  4. This is mainly about providing a base implementation for the magic methods, since the base task does already provide a priority. 3rd party packages will of course need to expand on this.
  5. Oh, ignore my comment then. I thought they were public.
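For item 4, a base implementation of the comparison could look like the sketch below. QueuedResult and sort_key are hypothetical names, the ordering is the one mentioned earlier in the thread (priority descending, then run_after and enqueued_at ascending), and run_after is assumed to always be set. A different default, such as LIFO, would only change sort_key.

```python
import queue
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class QueuedResult:  # hypothetical stand-in, not Django's TaskResult
    id: str
    priority: int
    run_after: datetime  # assumed to always be set in this sketch
    enqueued_at: datetime

    def sort_key(self):
        # Higher priority first, then earlier run_after, then earlier enqueued_at.
        return (-self.priority, self.run_after, self.enqueued_at)

    def __lt__(self, other):
        if not isinstance(other, QueuedResult):
            return NotImplemented
        return self.sort_key() < other.sort_key()


now = datetime(2026, 1, 1, 12, 0)
pending = queue.PriorityQueue()
pending.put(QueuedResult("low", 0, now, now))
pending.put(QueuedResult("high", 10, now, now + timedelta(seconds=2)))
pending.put(QueuedResult("high-later", 10, now + timedelta(minutes=5), now))

order = [pending.get().id for _ in range(3)]
# order == ["high", "high-later", "low"]
```

PriorityQueue only needs __lt__, so 3rd-party backends could override sort_key without touching the comparison itself.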

Performance receipts

I am working on a new queue-agnostic worker pool (like billiard in Celery with a Gunicorn interface): codingjoe/threadmill on GitHub, a queue-agnostic worker for Django's task framework.
This involved numerous benchmarks. I will try to create some reproducible benchmarks that underline some of my performance concerns.
