Sorry, I’m doing this backwards, because I already have a PR up, but it needs to have a thread here. It’s in some ways a small change that fixes a long-standing bug (ticket #25251), but it adds a command line option, so in that way it is a very big change.
The problem I’m running into is that --keepdb doesn’t always work. I mean, it never deletes the database, so technically it always works. But it also keeps any changes made to the database during testing, so it’s a bit of a monkey’s paw. What could mutate a database while testing? TransactionTestCases, TestCases that delete (? still digging into that), SimpleTestCases given database access (really), integration tests, honest mistakes, bugs, and that’s just my last week.
I think the root issue is that ticket #25251 isn’t really a bug: there are two valid and useful interpretations of “keep”. The bug is that Django only implements one (unfortunately, I believe the less useful one). The definition keepdb is using is “to stay or continue in”. It keeps the database where it is. But there’s also “preserve or maintain”. To keep the database what it is. If you have large test databases and a very stable test suite, then you might be willing to trade some reliability for speed and would prefer to keep your test database where it is. In many other situations, it’s better to trade a small bit of speed for the reliability and consistency of keeping the test database what it is.
That’s what I do with the --use-clones option. When used, it makes just two small changes: sequential runs also use the parallel code path, and clones are always overwritten (keepdb=False). That’s it. Simple, but it is a large change to the functionality and semantics, and not something that can currently be accomplished with the existing flags. And when errors do occur, it does more to keep the database than --keepdb does.
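To make the two meanings of “keep” concrete, here is a toy sketch using only Python’s standard-library sqlite3 module. It is not Django’s code and not the code in the PR; every function name in it is made up for illustration, and the “test run” is just a stand-in that leaves a row behind.

```python
import sqlite3


def make_template():
    """Build a pristine 'test database' with one seeded row."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE item (name TEXT)")
    db.execute("INSERT INTO item VALUES ('seed')")
    db.commit()
    return db


def leaky_test_run(db):
    """Stand-in for a test run that leaves a row behind."""
    db.execute("INSERT INTO item VALUES ('leftover')")
    db.commit()


def row_count(db):
    return db.execute("SELECT COUNT(*) FROM item").fetchone()[0]


def run_in_place(kept_db):
    """'Keep it where it is': tests run directly against the kept database."""
    leaky_test_run(kept_db)


def run_on_clone(template):
    """'Keep it what it is': tests run against a copy rebuilt for every run."""
    clone = sqlite3.connect(":memory:")
    template.backup(clone)  # recreate the clone from the untouched template
    leaky_test_run(clone)
    clone.close()           # the mutation disappears with the clone


# Two runs against the kept database: leftovers accumulate.
in_place = make_template()
run_in_place(in_place)
run_in_place(in_place)
in_place_rows = row_count(in_place)  # 3: the seed plus two leftovers

# Two runs against fresh clones: the template never changes.
cloned = make_template()
run_on_clone(cloned)
run_on_clone(cloned)
cloned_rows = row_count(cloned)  # 1: only the seed
```

The copy on every run is the cost; the payoff is that the second run starts from exactly the same state as the first.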
Does it need to be in Django? I think so. My initial implementation was a custom test runner (includes some details of my specific problems), but it has to copy a chunk of logic out of django.test.utils.setup_databases, which is a good sign that it isn’t operating at the right layer of abstraction. And I really believe this is something from which a lot of people would benefit. If anything, I think it should be renamed --skip-copies and turned on by default: it has almost all of the performance gains of keepdb with none of the footguns. For beginners, it’s all upside, and for users advanced enough to experience the performance downside, it’s an easy change.
(I’m joking; that would be too large a change in expected behavior on its own.)
Another option, given the functional overlap, would be to modify --keepdb to have profiles. So --keepdb and --keepdb fast would use the legacy behavior, and --keepdb strict would use the --use-clones behavior. And that would enable options like --keepdb strict-clean, which could clean up the cloned databases after itself (maybe even only clones that had no test failures; not sure how accessible that information is from that layer 🤔).
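As a rough sketch of how the profile idea could parse, here is a standalone argparse example. It is not Django’s actual argument parser (where --keepdb is currently a plain on/off flag), and the profile names are just the ones floated above; none of them exist in Django today.

```python
import argparse

# Profile names come from the proposal above; they are hypothetical.
KEEPDB_PROFILES = ("fast", "strict", "strict-clean")


def build_parser():
    parser = argparse.ArgumentParser(prog="test")
    parser.add_argument(
        "--keepdb",
        nargs="?",       # the profile value is optional
        const="fast",    # a bare --keepdb keeps the legacy behavior
        default=None,    # flag absent: do not keep the database at all
        choices=KEEPDB_PROFILES,
    )
    # Test labels are positional, as in a typical test command.
    parser.add_argument("test_labels", nargs="*")
    return parser


parser = build_parser()
bare = parser.parse_args(["--keepdb"]).keepdb              # "fast"
strict = parser.parse_args(["--keepdb", "strict"]).keepdb  # "strict"
absent = parser.parse_args([]).keepdb                      # None
```

One wrinkle with an optional value: a command like `test --keepdb myapp.tests` would try to read `myapp.tests` as the profile and fail the `choices` check, so existing invocations that put test labels right after --keepdb would break. Using `--keepdb=strict`, or a separate option for the profile, would avoid that.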
Any thoughts or feedback?