Our current practice around crashing errors (I take this to mean unhandled/undocumented Python exceptions in a supported workflow) is that, if they’re regressions, we handle them under the regressions bucket of the backport policy, which subjects them to a how-recently-was-it-caused qualification, rather than under the “crashing errors” bucket, which doesn’t.
Crashing errors are always either regressions or bugs in new features. The point of listing them separately from those two buckets, I take it, would be to clarify that a clock doesn’t apply. But our practice has been to apply the clock anyway.
A recent example is #35596. I think this qualifies for a backport to the current stable/mainstream support version (5.1), since whether it was a regression in any particular Django version is orthogonal to whether a Python exception escapes.
Is that right?
PS: I’m more interested in the principle (and updating docs if necessary) and less in any particular ticket, including the above one.
I sympathize with your feeling here, and I’m not sure what the right answer is. We have to weigh the needs of the reviewers and fellows against what would be ideal for users. I think it might be helpful if we filled in some specific answers, to help us understand the costs associated with a policy that aligned better with your intuition.
How far back should reviewers or fellows be required to go testing before we just call a regression a bug fix, since it’s been around long enough anyway?
Is there a bright-line rule appropriate for that boundary, if such a boundary exists?
Or, if an enterprising contributor is willing to go spelunking through old code to find a point where it still works (or if it’s easy to figure out exactly where the bug was introduced), does it just not matter how old the regression is? Maybe the rule only applies to reviewers and fellows, as a default for when things are hard to determine?
It’s worth emphasising that the backport policy is for users, rather than reviewers and fellows. Folks have stable deployments, and every backport introduces the risk of regressions in those deployments. (This isn’t theoretical. In my time as fellow, we would have reports of breakages almost every time we backported anything.)
I think the policy is fine as it is. It’s stood the test of time well and strikes a good balance. There are regularly cases where one wants to backport, and it’s frustrating when one can’t, but we benefit from Django’s stability guarantees far more than that annoyance costs us, I would (and do) maintain.
It’s always possible to make the case for a backport in a particular instance. I glanced at the issue in question and wasn’t immediately sure. I’d usually defer to the fellows’ judgment in this kind of case.
Personally, I believe the change in question should not be backported, for the same reasons outlined by Carlton. The proposed change modifies production code to resolve an issue with test runs. I feel the potential risk of breaking production systems outweighs the benefit of addressing an exception in a test run when using pytest.
More generally, I think we need a clearer policy on what qualifies as a “crashing bug,” as well as the scope and limitations of such issues. This would help me, as a Fellow, make more informed decisions. Is this something that would fall under the purview of the @steering_council?
I’m happy to add discussing the policy to our todo list.
In the specific case of #36056, my question would be: can the exception cause a problem in a live system, or is it just a bit noisy? If the former, I think we should backport; otherwise I think we shouldn’t.
I always thought “crashing bug” was quite clear: something that causes an unhandled error when handling actual web requests. (Contrast with a bug in test code, which clearly doesn’t.) It’s also worth distinguishing a crash of a single request from a crash of the whole process. (Think of an error that forces you to restart the development server; you probably see those all the time. That would clearly need a backport if it made it to a release.)
“Data loss” (the other one) seems clear enough too.
Thanks @Lily-Foote. The answer is that #36056 does not crash a live system; it’s only a bit noisy when using pytest on a test suite that exercises a management command. So I agree that this does not qualify for a backport.
Thanks all for the input. In the end I agree it may not be worth the risk if the only effect is in tests. Theoretically there’s a path to affecting production systems too, since one can swap stdout/stderr when running a management command (sketched below), but we can reconsider the decision if anyone reports it.
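To make that theoretical path concrete, here’s a minimal sketch of my own (not code from the ticket). Passing `stdout`/`stderr` to `call_command()` and having them forwarded to the command is documented Django behaviour; the closed stream at the end is the hypothetical hazard being discussed:

```python
# Sketch of the hypothetical production path: call_command() accepts
# arbitrary file-like stdout/stderr objects and forwards them to the command.
import io

from django.core.management import call_command

buf = io.StringIO()
call_command("check", stdout=buf)  # capture a built-in command's output
report = buf.getvalue()
buf.close()
# If command machinery wrote to or flushed `buf` after this point, the
# resulting exception would escape in production code, not just under pytest.
```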
For those experiencing the issue on Django 5.1 or earlier, I wrote a blog post covering a way to silence the error.
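I haven’t reproduced the blog post here, but one workaround along the same lines (assuming the noise comes from the command touching pytest’s captured, later-closed `sys.stdout`/`sys.stderr`) is to give `call_command()` its own in-memory streams, so the command never writes to the captured ones:

```python
# Workaround sketch: pass explicit streams so the command never touches
# pytest's captured sys.stdout/sys.stderr. "my_command" is a placeholder
# for whatever command the test suite exercises.
import io

from django.core.management import call_command


def test_exercises_my_command():
    stdout, stderr = io.StringIO(), io.StringIO()
    call_command("my_command", stdout=stdout, stderr=stderr)
    assert "expected output" in stdout.getvalue()  # placeholder assertion
```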
> I always thought “crashing bug” was quite clear: something that causes an unhandled error when handling actual web requests. (Contrast with a bug in test code, which clearly doesn’t.)
But management command crashes are valid too, right? If a crash prevented users from running legitimate tests, we’d backport that, right? I think some case-by-case analysis is always needed.