Ticket #25251 opinions required: Cloning the test DB as an alternative to rollback emulation in TransactionTestCase

Hi folks,

I was curious whether cloning the test DB is a viable option for solving #25251. I did some experimentation and it works quite well for my use case. I also found it’s a lot faster than the rollback emulation that TransactionTestCase optionally does. Here’s my experiment with some basic speed comparison: stupid-django-tricks/clone_db_testcase at master · shangxiao/stupid-django-tricks · GitHub

Is it worth adding to TransactionTestCase? Draft PR to demonstrate: Fixed #25251 -- Added cloning option to TransactionTestCase by shangxiao · Pull Request #18009 · django/django · GitHub

This sounds promising to me and I haven’t spotted any obvious problems with the approach. I would love to see #25251 solved.

I should also point out that Markus had a solution to #25251 that reinserted the serialised data back into the database on teardown. This approach also worked for me but I wanted to see if we could do it without the current “slightly horrific” serialisation approach to rollback emulation.

That’s very promising indeed, @shangxiao! I deal daily at work with TransactionTestCase losing data created in data migrations, and I ended up recommending that all tests be adjusted to use patterns such as get_or_create rather than assume the data will persist.
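
To illustrate the workaround, here is a minimal sketch of the get-or-create idea. It uses the standard-library sqlite3 module as a stand-in for Django’s ORM so that it runs without a project; the table name `app_group`, the helper `get_or_create_group`, and the `DELETE` that imitates the post-test flush are all assumptions made for the example, not Django internals.

```python
import sqlite3


def get_or_create_group(conn, name):
    """Return (id, created) for the named group, inserting it if missing."""
    row = conn.execute(
        "SELECT id FROM app_group WHERE name = ?", (name,)
    ).fetchone()
    if row is not None:
        return row[0], False
    cur = conn.execute("INSERT INTO app_group (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid, True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_group (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
# Row that a data migration would normally have inserted.
conn.execute("INSERT INTO app_group (name) VALUES ('editors')")
conn.commit()

# Imitate the table flush that follows a TransactionTestCase test.
conn.execute("DELETE FROM app_group")
conn.commit()

first = get_or_create_group(conn, "editors")   # row is gone, so it is recreated
second = get_or_create_group(conn, "editors")  # row exists now, so it is reused
```

In a real test the equivalent call would be something like `Group.objects.get_or_create(name="editors")`, which tolerates the row being either present or absent.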

In the PR you mention that the approach could be sped up even more by creating a clone per TransactionTestCase to avoid loading fixtures multiple times, and I think it would be worth doing. I guess it could even be done once per hash(fixtures) and reused between multiple TransactionTestCase instances, but that would make keeping track of the clones and disposing of them harder.
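
A rough sketch of what the bookkeeping for the per-hash(fixtures) idea might look like, in plain Python. Everything here is hypothetical: `CloneRegistry`, the `create_clone` and `destroy_clone` callbacks, and the clone naming are made up for illustration and are not taken from the PR. It uses hashlib rather than the built-in hash() so that keys stay stable across processes, and it keeps the fixture order as given, on the assumption that load order can matter.

```python
import hashlib


class CloneRegistry:
    """Track one database clone per distinct fixtures list (hypothetical)."""

    def __init__(self, create_clone, destroy_clone):
        self._create = create_clone
        self._destroy = destroy_clone
        self._clones = {}

    @staticmethod
    def key_for(fixtures):
        # Order-sensitive, process-stable key for a fixtures list.
        joined = "\n".join(fixtures)
        return hashlib.sha256(joined.encode()).hexdigest()[:12]

    def get(self, fixtures):
        """Return the clone for these fixtures, creating it on first use."""
        key = self.key_for(fixtures)
        if key not in self._clones:
            self._clones[key] = self._create(key, fixtures)
        return self._clones[key]

    def dispose_all(self):
        """Destroy every clone that was created, e.g. at the end of the run."""
        for name in self._clones.values():
            self._destroy(name)
        self._clones.clear()


created, destroyed = [], []


def fake_create(key, fixtures):
    # A real implementation would clone the DB and load the fixtures here.
    name = f"test_db_clone_{key}"
    created.append(name)
    return name


registry = CloneRegistry(fake_create, destroyed.append)
a = registry.get(["users.json", "groups.json"])
b = registry.get(["users.json", "groups.json"])  # reuses the first clone
c = registry.get(["users.json"])                 # different fixtures, new clone
registry.dispose_all()
```

The harder part alluded to above is deciding when `dispose_all` (or a per-clone equivalent) can safely run, particularly with parallel test workers; this sketch does not address that.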

I believe something worth covering in more detail is all the test options added over the years to make TransactionTestCase faster, which this could make obsolete.

For example, what should be done with reset_sequences? You mention that it doesn’t really apply, but I think it does to a certain extent: either both reset_sequences=True and reset_sequences=False have to continue being supported, or the option must be deprecated. For example, some tests out there might rely on reset_sequences=False and expect that no ID for a table is ever reused for the duration of the suite. Cloning breaks that. Not that it’s a blocker in itself, but it’s worth pointing out. As for reset_sequences=True, the same approach you mentioned for fixtures could be used: create a per-test-class clone first, reset the sequences on it, and use it as the clone base for each test method.
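
The ID reuse concern can be shown with a small standalone sketch. It uses plain sqlite3 with an AUTOINCREMENT column and `Connection.backup` as a stand-in for cloning; it is not how Django or the PR implements either strategy, and the `item` table and helper names are invented for the example.

```python
import sqlite3


def make_base():
    """Build a template database holding one row, as a data migration might."""
    base = sqlite3.connect(":memory:")
    base.execute(
        "CREATE TABLE item (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
    )
    base.execute("INSERT INTO item (name) VALUES ('from data migration')")
    base.commit()
    return base


def clone(base):
    """Copy the whole database, including its sequence state."""
    copy = sqlite3.connect(":memory:")
    base.backup(copy)
    return copy


def insert_item(conn, name):
    """Stand-in for a test body that creates one row; returns the new id."""
    cur = conn.execute("INSERT INTO item (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid


def ids_with_flush():
    """Delete all rows between tests but leave the sequence untouched."""
    conn = make_base()
    ids = []
    for name in ("test_a", "test_b"):
        ids.append(insert_item(conn, name))
        conn.execute("DELETE FROM item")
        conn.commit()
    return ids


def ids_with_clone():
    """Give every test a fresh copy of the template database."""
    base = make_base()
    ids = []
    for name in ("test_a", "test_b"):
        conn = clone(base)
        ids.append(insert_item(conn, name))
        conn.close()
    return ids


flush_ids = ids_with_flush()  # [2, 3]: the sequence keeps advancing
clone_ids = ids_with_clone()  # [2, 2]: each clone restarts from the template
```

With the flush strategy the IDs never repeat, but the migration row is lost after the first test; with cloning the migration row survives, but every test sees the same next ID.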

available_apps was originally added to speed up TransactionTestCase for large Django apps with a lot of models (Anssi did it for the Django internal suite at first), so it’s fair to assume that it might no longer be necessary. It’s documented to do way more than that nowadays, though, so we have to ensure that we find a proper deprecation path here and document what exactly will change.

To summarize, the approach seems promising, particularly from a performance perspective (I’d appreciate it if your benchmark were run against all backends so we get a clearer view), but there are non-trivial deprecation questions that still need to be answered.
