Request for Technical Board/Steering Council vote: Backport ticket 34063?

The short summary of this is that I’m asking for a vote to allow a backport of the fix from ticket 34063 into all currently-supported feature releases of Django: Django 4.1, Django 4.0, and Django 3.2.

Full context, my reasoning for wanting this, and the exact issue to be voted on, follow below.

Preamble

Why does Django have a backport policy? It would, after all, be simpler from a maintenance perspective to just not fix bugs in older feature releases of Django, and instead require everyone to always upgrade to the latest feature release if they also want the latest bugfixes. Yet Django has a backport policy, and according to that policy certain types of fixes are backported into older feature releases. Doing this has many advantages, not least of which is that it promotes the image of Django as a stable, reliable framework that developers can use with confidence.

Yet that policy does not and cannot cover every conceivable case. Here we have a case that the policy did not foresee: a bug of such severity that nobody – as far as I am aware – disputes it would have been a release blocker if only it had been discovered early enough. It was not discovered earlier, though, and now it just misses on technicalities of several of the criteria in the backport policy. For example: it is a major functionality bug, but it is no longer a bug in a new feature of the latest stable release, and it causes a crash, but of your test suite rather than of Django itself.

When I asked on the django-developers list about making an exception to backport the fix for ticket 34063, though, I received a reply that cited the strict letter of the policy and seemed to admit no possibility of exceptions ever being granted.

Yet historically, exceptions have been granted from time to time. And not only have been granted, but have even led to changes in the backport policy. For example, once upon a time the policy did not allow for backporting fixes for regressions. That did not stop ticket 25548 from being backported; notably, in that case the backport policy was retroactively updated to cover the case of backporting a regression.

I continue to believe that a similar exception ought to be granted in the case of ticket 34063, and that after it has been backported, the text of the backport policy should be revisited to ensure that a case like this one is not excluded in the future.

The bug

The actual bug appears to have been introduced in Django 3.1, at exactly the same time that async view support, async middleware support, and async test support were initially added to Django.

The issue specifically is in django.test.AsyncClient and perhaps also in the associated django.test.AsyncRequestFactory (though I have not personally verified the bug in the case of the latter on its own): if any async view or async middleware, invoked as a result of a request from AsyncClient, happens to access request.POST, it will raise an exception with the message:

"AssertionError: Cannot read more than the available bytes from the HTTP incoming data."

The fix was quite small – only a few actual lines of code changed in django/test/client.py. The majority of the patch was retroactively adding basic functional tests that ought to have been included in the original addition of async support, and which would have caught this bug had they been included at the time.

The severity

The bug exists in all currently-supported releases of Django: 3.2, 4.0, and 4.1.

The bug is easy to trigger: if you use the built-in async test client as documented, and any async view or middleware invoked as a result accesses request.POST, you will get a crash with the above-mentioned error message.

The bug is hard to avoid: due to the way Django supports running both sync and async code side-by-side, there are few reliable ways to ensure async code paths are exercised in a test. Using the async test client is the most obvious way to do this and, as far as I am aware, the only officially reliable way.

The bug has a workaround: any view or middleware which wants to access request.POST in an async code path can first execute a throwaway acccess of request.body, after which reading request.POST will behave as expected.

But the workaround has several problems:

  • It is difficult to discover. I only found it after about a day of off-and-on debugging and research into a failing test in some of my own code. Search engines were not particularly helpful, even when given the full exact error message.

  • It is burdensome. All it takes to trip the bug is one async code path reading request.POST without first touching request.body, which means a developer needs to hunt for and find every such path in their code.

  • It is not completely effective. Because real-world use of Django often involves use of views and middlewares from Django itself, from the django.contrib applications, and from third-party applications, it simply may not be the case that all code paths are under the developer’s control and able to have the request.body workaround applied. Attempting to implement an “access request.body middleware” to universally apply the workaround is also unreliable, since both middlewares and some request attributes are sensitive to ordering issues; it is not too difficult to wind up in a situation where no order of middlewares/accesses avoids all potential problems.

Furthermore, if no fix for the bug is to be backported due to strict interpretation of the backport policy, then a similarly-strict interpretation would forbid adding any type of helpful note to the documentation for Django 3.2, 4.0, and 4.1, thus making it even more difficult for developers using Django to discover the cause of the bug or its workaround.

The case for backporting

As noted above, Django’s backport policy plays an important role in promoting Django’s image as a stable, reliable framework that can be used with confidence. Developers who use Django know that the project is responsive to bug reports, and that each feature release has a documented support period in which it can receive fixes for major bugs without forcing the developer onto a fast “upgrade treadmill” of feature releases.

Django is also going through an important period in its development history: the addition of support for async Python. There are numerous competing frameworks which already heavily advertise their full-async stacks or support, and which are growing in popularity on a daily basis. Django must keep up if it is to continue thriving.

Yet at the moment, using Django’s own documented async support, as documented, is significantly more difficult than it ought to be, since the primary documented mechanism for testing async code – the AsyncClient – has a bug of such severity that it would have been an obvious release blocker if found earlier, with a workaround that is both difficult to discover and difficult/burdensome (in some codebases, likely impossible) to apply effectively.

If the fix for AsyncClient is not backported into older feature releases, async-curious developers will be faced with a conundrum: async Django view and middleware code, despite being officially supported by the framework for some two and a half years at this point, will be effectively untestable (which, in many projects, is synonymous with “unusable”) for at least another four months, since it will not be until Django 4.2, in April 2023, that the Django project finally issues a release with a non-broken AsyncClient.

The situation will be even worse for maintainers of third-party applications and libraries: many of them adopt a policy of supporting whatever Django versions continue to receive upstream support from the Django project itself. This would push the earliest date of reliable AsyncClient availability out into the second quarter of 2024, which is when all currently-supported Django feature releases will finally drop out of support.

It was suggested in the original django-developers thread that maintainers of third-party libraries/applications should pre-emptively cut off Django versions prior to 4.2 if they want to provide async support. As a maintainer of several third-party Django applications, I would find that unacceptable – a major plus of the Django ecosystem is the way that maintainers tend to sync up their support cycles with Django itself whenever possible. Not to mention that I and everyone else who uses Django should be able to use and rely on documented APIs and features of Django as of the releases they are documented in.

It also has been suggested both in the django-developers thread, and on IRC, that async support in Django should be treated as more of a work in progress with lessened expectations of reliability, and/or that bugs in async support perhaps would not or should not be eligible for backporting even in cases when strict reading of the policy would support a backport, due to async support being in such a state of flux. This fails for multiple reasons:

  • It runs against the idea that Django is and should be a reliable framework – an idea that the backport policy is supposed to support and promote!

  • The documentation states that async support “for the ORM and other parts of Django” is still being worked on. But for the parts of async that have already landed, the reader is given no hint that these things might have been shipped and then knowingly left in a badly broken state. If this is such unstable code that it cannot be trusted and bugs – including some functionality just being outright broken – are to just be expected, then why are there not loud warnings about it in the documentation?

  • The existing async support – APIs, tools, and so on – is documented in a way which clearly places it under the Django API stability and backporting policies. Either those policies apply as written, or they do not. If they do, then the framework owes the support and stability guaranteed by those policies. If they do not, then the policies cannot be used as strict letter-of-the-law rules which admit of no exceptions.

In fact, it seems to me that if policies are to be relaxed on the stability and reliability of async in Django, then that relaxation must also affect the backport policy, and allow for these kinds of severe but late discovered bugs, when they crop up, to be backported to all supported releases containing the bugged feature.

Finally: leaving a bug of this severity, in a part of Django that has been hyped for the past several releases, deliberately and knowingly unfixed in all the currently “supported” Django releases, would be a significant breach of the trust and goodwill Django enjoys from its community and ecosystem. I have repeatedly emphasized Django’s reputation for reliability and stability. Such reputations take years to build, but can be lost in an instant.

The maintenance burden of backporting the AsyncClient fix appears to me to be minimal; as noted above, only a handful of actual lines of code changed in django/test/client.py. The impact of not backporting it seems large: it acts as something like a “missing stair” for async Django in that the bug is of high severity and easy to to trip over once you begin writing and testing async code, but difficult to find out about before you do trip over it (and can be difficult to find a solution for afterward). This will hinder and fragment adoption of async Python in the Django community and ecosystem in exchange for no clear gain that I can see on maintainability of Django itself, and will also hurt the reputation Django currently enjoys as a reliable and stable framework with good support policies.

In light of which, I strongly urge that you override the objections from the django-developers thread. I put before you the following:

Shall it be directed:

  • That backport of ticket 34063 SHALL be allowed into each of Django 3.2, Django 4.0, and Django 4.1,
  • That in light of such permission, the Mergers MUST NOT use the backport policy as a reason to refuse merging of such a backport,
  • That the Releasers SHALL issue new bugfix releases of Django 3.2, Django 4.0, and Django 4.1, containing the backported fix of ticket 34063,
  • That after the backport is complete and merged into each of Django 3.2, Django 4.0, and Django 4.1, the backport policy SHALL be revisited, discussed, and revised as needed to cover cases such as the case of ticket 34063, in which backporting is worthwhile but in which backporting is not presently, by a strict reading of the policy, permitted.

I am qualifying this as you asking for a Minor Change vote, James, and I personally believe that you have not allowed sufficient time for either consensus or lack of consensus to be reached, as the thread on django-developers was started only 17 hours ago. I didn’t even get time to read and digest the thread from my inbox yet.

I will summon the rest of the @steering_council here via mention, but I believe we should decline to vote on this issue until there’s been at least a few more people contributing to the discussion, or the existing discussion has clearly finished and a few days have passed.

Although it is only one person replying, that person is a Merger and seems quite adamantly against the proposal.

Since the only options for getting backports merged other than “convince a Merger” (seemingly not going to go anywhere) are “spin it as a security issue that the Security Team can order merged on its own initiative” (not applicable) or “convince the Steering Council”, and since the Merger in question suggested bringing it to the Steering Council, I think it’s a live issue suitable for a vote.

Wowser, I can’t escape this wherever I go :sweat_smile:

Oh well…

For what it’s worth my objection here is not to exceptions to the backport policy being made, but rather than this issue justifies it.

The issue means that accessing request.post on ASGIRequest objects created by AsyncRequestFactory and AsyncClient will error if the body size is less that 65KB, unless request.body in read first, in which case a different code-path comes into play. It’s only the testing tooling that’s affected.

This is fixed in main and will be part of Django 4.2.

For AsyncRequestFactory one can simply add a request.body in the test case if needed.

This bug likely does mean the AsyncClient isn’t really usable for many tests before Django 4.2.

Bottom-line: the sync Client is still available, to test your async views and middleware. I don’t see that needing an async test, where using the lower level request factory isn’t viable, is so pressing as to require a well-passed term backport. That it took two releases to even see, and another one before we got here shows that I’d argue.

I think the MUST NOTs and SHALL in the proposal here are unnecessary, but
whatever. As the Steering Council think.

I’m going to mute this thread too, until after the new year.
Happy festivities all! :partying_face:

Kind Regards,

Carlton

Hi folks,

this is a lot to digest and I’d like to ask everyone involved to stay considerate. For one it is holiday season and for the other we are all just humans (with flaws).

Next, I think our fellows(/mergers) are doing a great job. Our backporting policy is relatively strict to keep our sanity. As a result of that the fellows/mergers are usually also the first ones to receive complaints because they are the ones saying no – and everyone has their pet issue which they like to see backported over potentially every supported release.

On a procedural level I agree with Andrew. The job of the steering council is to provide a tie breaking vote when our other decision making processes fail. It is true that Carlton suggested to take it up with the steering council, but I think this was mostly his way of saying that this decision isn’t necessarily his to make (please correct me if I am wrong Carlton), I am sure Carlton is fine to backport the patch if that is the result of the developers discussion and we surely won’t need a vote for this.

Regarding voting on this: As Andrew, I do not think that our decision making process has failed on a level that would allow us to vote on this already. On a more personal level I am saddened that you think Django has reached a level where we would require a steering council vote in less than 24 hours (and I think even a week would be a short minimum) after opening a thread and receiving answers from one person. Is this really how we want to run Django?

A bit more on the bullet points you want us to vote on: I think point two is completely unnecessary and feels close to a mockery of the process. Is there any indication somewhere that you think that after a positive vote on point one the mergers would use the backport policy as a shield to refuse such a backport (I do not think this would be possible btw since the vote on point one would have to take the backports policy into account anyways, otherwise there would be no reason to vote)? As for point three, releases should follow the usual process. Even “normal” release-blockers do not trigger immediate releases and there is no reason for this bug to be special.

I do not think there is much point to vote on point four either, there is simply no need to vote on starting a revisitation/discussion of the backport policy. This can always be started via a thread on this forum or developers and only if we fail to reach consensus there, then we should need to vote on it.

Let’s talk about the bug itself: It is true that this bug is hard to find via Google at least but putting the error message into Trac’s search though immediately finds it. I also do think it is easy to work around, many projects already have compat.py modules for different Django versions and I think a subclass of AsyncClient would do this nicely (I haven’t tested this, but I do not see why this wouldn’t work after glancing over the fix) [1].

As for backporting the fix itself I am somewhat torn. If an easy workaround exists then we shouldn’t have to backport. On the other hand it seems people are now starting to use async Django and begin to run into those issues – this is great, otherwise we wouldn’t know if anyone is using it. I’d be inclined to make an exception here for 4.1 and backport it there to have at least one version that would allow the usage of AsyncClient before 4.2 is released. I honestly do not see much point in backporting this to the latest LTS given that it took way over a year for this to be discovered.

Cheers,
Florian

EDIT://
[1]: And Adam did beat me to it: https://groups.google.com/g/django-developers/c/Sl4bL4r0Kes/m/gn89yIiDBgAJ – working around this seems to be doable easily.

Is this really how we want to run Django?

I explained why I posted here. I do not feel that this level of scolding is necessary for someone merely trying to use the DEP 10 processes.

I think point two is completely unnecessary and feels close to a mockery of the process.

Given what has happened in the past with the SC interpreting my formal prose, I thought it best to be as precise and detailed as I could manage in explaining what I was asking the members to vote on, and to try to head off unintended consequences.

I’d be inclined to make an exception here for 4.1 and backport it there to have at least one version that would allow the usage of AsyncClient before 4.2 is released. I honestly do not see much point in backporting this to the latest LTS given that it took way over a year for this to be discovered.

I do not believe there is a consistent position available which makes an exception to backport only to a subset of currently-released versions of Django. And as I have noted, the fact that the Django ecosystem tends to follow upstream Django means that refusing to backport all the way to 3.2 would either fracture the ecosystem – by forcing some maintainers to cut support early when Django 3.2 still has over a year of upstream support to go – or further delay the widespread usability of async Django. Neither of these options is palatable. Hence: the backport, if it happens, needs to happen in all supported versions: Django 4.1, Django 4.0, Django 3.2.

If an easy workaround exists

I have already replied to Adam on django-developers pointing out the issues with his proposed workaround.

I am posting a quick reply here to remind everyone that it is useful to follow the ongoing django-developers thread, because it has now been pointed out there that this bug was actually reported, in the Django Trac, shortly after the release of Django 3.1 (so within the window when the backport policy would have said to apply the fix in 3.1). Unfortunately, the report was closed as invalid at the time, and it was another two years before a report of the same bug was accepted and actually fixed.

I think there won’t be any more discussion on the forum, so if you still want a vote I guess now would be the time to request one. If you still want a vote, do you want to vote on the 4 bullet points from your initial post or do you want to refine them given the discussion here and on the mailing list?

As I read it, the following people are in favor of at least some level of backporting of this fix:

  • I am (obviously) in favor of doing the backport to all of: 4.1, 4.0, 3.2.
  • Kevin Grinberg in the django-developers thread seemed to be in favor of it as well.
  • Matthew Pava in the django-developers thread seemed to be in favor of it.
  • Shai Berger in the django-developers thread seemed to be in favor of it, but would prefer to settle new language for the backport policy since that potentially makes the whole debate about this specific ticket moot by introducing a new/redefined category that it would fall into and that would be eligible for backport.
  • Florian Apolloner, above, seems to be in favor of a partial backport (only to 4.1, not to 4.0 or 3.2).

And the following people are against:

  • Carlton Gibson, both above and in the django-developers thread, seemed to be strongly against any backport on grounds of both the backport policy and concerns about maintenance.
  • Tim Graham, in the django-developers thread, seemed to be strongly against any backport, on similar grounds as Carlton.
  • Adam Johnson, in the django-developers thread, seemed to be against any backport, on grounds that an exception to the backport policy was not warranted for this case.

Or, summarized numerically: 4 in favor of the “full” (4.1, 4.0, 3.2) backport, 1 in favor of partial (4.1 only), 3 against any backport. The SC should now decide whether this represents a decision made via discussion (since there is a single position favored by a majority of participants, albeit a slight one), or a lack of consensus requiring an SC decision.

The SC also seems to prefer that the question be re-worded. So I will re-word it as follows:

Shall it be directed:

  • That, the wording of Django’s backport policy notwithstanding, backporting of the fix for Django ticket 34063 SHALL be allowed into each of the following release branches: stable/4.1.x, stable/4.0.x, and stable/3.2.x,
  • That the Mergers are not required to develop or assist in the development of patches toward such backport, but MAY do so at their discretion, and SHALL merge such patches, if they be provided by any party in good standing to contribute to Django and such patches otherwise conform to the generally-accepted standards of contributions to Django (such as passing the automated checks run on pull requests),
  • That the backported fix SHALL be included by the Releasers in the next release of each such branch of Django to be issued after such time as the patches are merged.

I believe that does not represent consensus and that the SC should now vote on this as a Minor Change - thanks for keeping the discussion going and getting some good opinions across the board.

My vote is 0 - while I do agree that the ticket is annoying to not have fixed in earlier versions, and would be generally supportive of it being backported if it developed without Merger time and can be proven it has no further maintenance burden (which I believe it does not in this case), I specifically do not agree that mergers SHALL be required to merge patches in the manner presented in the question, and ultimately it should be their call in most cases.

As a point of order, this stance is not compatible with DEP 10.

The Mergers can, in their individual capacities, take part in technical discussions just as anyone else can. But in their official capacities as Mergers their only decision-making authority is to make process decisions such as “is this patch a Minor Change which achieved consensus in favor of merging”. They are explicitly not to make or enforce official technical decisions such as “is this patch a good idea for Django”.

Such technical decisions are to be steered by the community, with this group as a backstop in the event consensus is not reached through normal community processes. As DEP 10 states, this group has the power to “make a binding decision regarding any question of a technical change to Django.” The question at hand is one of a technical change to Django – specifically, whether a particular Minor Change, that failed to achieve consensus, should be adopted. If you feel that it should be adopted, then directing the Mergers to merge it is an unavoidable consequence.

This has not been a problem in any of the multiple prior instances where, under DEP 10, the Mergers have been obligated to merge code into Django. For example, this happens every time a security patch is finalized. The Mergers actually do not have a choice in that because, again, their official decision-making authority is limited to process decisions – at most they could dispute that a patch had actually been authorized by the security team, but once a patch is clearly authorized by the security team, the Mergers must merge it.

The only difference between those cases and this one is that the security team has less formal structure around the details of its operations; the question put to you above is, in effect, the same thing the security team is ordering every time they sign off on a security patch. It is only in the case of the DEP 10 technical decision-making process that it needs to be put into such clear and explicit formal terms, in order to ensure everyone understands what is being voted on and what will occur as a result of the vote. And it is likely that future votes could use more streamlined language, but for this – which I believe is the first binding vote under DEP 10 that potentially would have a direct result of a Minor Change being merged into Django – I think it is worth being absolutely clear about the process and doing everything as “by the book” as possible.

And more broadly, merging code into Django is supposed to be a mechanical action under DEP 10. Either a given patch has cleared the process bar to be merged, in which case it gets merged, or it has not, and it does not get merged (or does not yet get merged). Once again the Mergers, in their capacities as Mergers, do not have the authority to make merge decisions on any other grounds.

This is the very essence of the DEP 10 model: those who have the authority to merge code into Django explicitly do not and must not also have the authority to make official technical decisions about which code is or is not good to have in Django. So your statement above simply is not compatible with DEP 10. Even if the explicit “manner presented” were changed, a vote to adopt this Minor Change, or any other future vote to adopt a Minor Change, would have the binding effect of requiring the Mergers to merge it.

You can, of course, still vote any way you like and for any reason you like – that is your prerogative – but I would still ask, as the author of DEP 10, that you show it more respect than it has received thus far.

The reason for my vote is that, if the wording as you presented it was accepted, any backport patch that passed our general standards bar would have to be accepted, and this is not a technical direction decision that I in my role as an SC member wish to support. I don’t mind if such patches have to go to community discussion - but I do not believe any patch that turns up and passes our automated checks should have to be merged.

If the patches were provided in advance of the vote, so that the issue to vote on became whether to merge those exact specific patches, would that satisfy you? Would there be any other objections or obstructions that would be helpful to know about in advance of preparing such patches and requesting such a vote?

I think voting on specific patches is far more tenable, yes.

I can put together patches, but likely will not have the time to do so until this weekend. Then I suppose we will try round three of this.

Actually, I am not going to put in the effort unless I have some assurance that it will be useful. So let us begin by having a vote on the following.

Shall it be directed:

If and only if this passes, I will work up patches and then ask for a followup vote specifically approving those patches and directing that they be merged.

It is interesting that we do have a process for voting but none to what should happen if the vote changes or is withdrawn – guess noone thought of that :slight_smile:

For the matter at hand (or to me more precise, I am voting on Request for Technical Board/Steering Council vote: Backport ticket 34063? - #16 by ubernostrum), this is my resolution:

I am torn, but my final vote on this matter is 0.

As James has pointed out, I am not against a backport in general and personally would support it to 4.1 It was pointed out though that there might not be a consistent position to support a backport solely to 4.1. I tend to agree. As a developer I would even be fine to backport it to all releases, especially if it affects myself. After all it is just a few lines and a couple of releases. And this is where the slippery slope for me begins. While the bug was probably first discovered relatively early on (#32189 (test.client.AsyncClient request POST failing) – Django), there was not much chatter about it since. So as much as I’d like to spin this as critical issue that hampers async adoption in Django, I simply cannot. There have been other tickets where people where asking/complaining loudly about a backport. But since this issue does imo not raise to such a level of criticality or importance I cannot say +1 on a backport. I also couldn’t say +1 if there merge requests where ready to go and James would cut the releases (which he could as Releaser) because this brings us into a weird position where Mergers could ask the steering council for votes on backports every time with the argument that they would be doing all the work (whereas others don’t even have that option).

@steering_council May I ask you to repeat your final thoughts here so we have on consistent place about it and put a formal vote here till tomorrow.

I think it’s fine for the original requestor of a vote to withdraw it.

On the specific topic of Request for Technical Board/Steering Council vote: Backport ticket 34063? - #16 by ubernostrum, my vote is 0, as while I think the fact that this is purely in the testing framework offsets my concerns about follow-on maintenance, it’s also outside our backport policy and I do not want to set a precedent.

@adamchainz @orf @charettes Can you please cast your votes here?

My vote is a -1.

I don’t see a strong reason to backport this change over the dozen other late-for-backport regressions we’ve seen over the years that affect the framework (e.g one that broke usage of StreamingHttpResponse with ASGI and will only be fixed in 4.2) without setting a precedent to more similar appeal in the future.

I’d be open to discuss amendment to the policy that allows for a formal application process for exceptional backports to take place but I’m not convinced an exception should be allowed for the proposed changes given the current state of things.