Ticket #25251 opinions required: Cloning the test DB as an alternative to rollback emulation in TransactionTestCase

shangxiao · March 24, 2024, 6:17am

Hi folks,

I was curious whether cloning the test DB would be a viable way to solve #25251. I did some experimentation and it works quite well for my use case. I also found it’s a lot faster than the rollback emulation that TransactionTestCase optionally does. Here’s my experiment with a basic speed comparison: stupid-django-tricks/clone_db_testcase at master · shangxiao/stupid-django-tricks · GitHub
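To make the idea concrete, here is a minimal sketch of "clone instead of roll back" using only the standard library's sqlite3 module. It is not the code from the experiment or the PR, and it does not touch Django: the `country` table, the helper names, and the use of `Connection.backup` as the cloning mechanism are all assumptions chosen for illustration.

```python
import sqlite3


def make_template() -> sqlite3.Connection:
    """Build a 'template' DB holding data that a data migration would create."""
    template = sqlite3.connect(":memory:")
    template.execute("CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT)")
    template.execute("INSERT INTO country (name) VALUES ('Australia')")
    template.commit()
    return template


def clone(template: sqlite3.Connection) -> sqlite3.Connection:
    """Return a fresh copy of the template for a single test to use."""
    copy = sqlite3.connect(":memory:")
    template.backup(copy)  # stdlib API: copies the whole database into `copy`
    return copy


def count(db: sqlite3.Connection) -> int:
    return db.execute("SELECT COUNT(*) FROM country").fetchone()[0]


template = make_template()

# "Test 1" commits a destructive change on its own clone.
first = clone(template)
first.execute("DELETE FROM country")
first.commit()

# "Test 2" gets a new clone and still sees the migration data.
second = clone(template)

rows_in_first = count(first)
rows_in_second = count(second)
rows_in_template = count(template)
```

Here the first clone ends up empty while the second clone and the template each still hold the one migrated row, which is the property rollback emulation tries to restore by re-inserting serialised data.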

Is it worth adding to TransactionTestCase? Draft PR to demonstrate: Fixed #25251 -- Added cloning option to TransactionTestCase by shangxiao · Pull Request #18009 · django/django · GitHub

Lily-Foote · March 24, 2024, 12:20pm

This sounds promising to me and I haven’t spotted any obvious problems with the approach. I would love to see #25251 solved.

shangxiao · March 24, 2024, 1:10pm

I should also point out that Markus had a solution to #25251 that reinserted the serialised data back into the database on teardown. This approach also worked for me but I wanted to see if we could do it without the current “slightly horrific” serialisation approach to rollback emulation.

charettes · March 26, 2024, 2:15am

That’s effectively very promising @shangxiao! I have to deal with the issue of TransactionTestCase losing data created in data migrations on a daily basis at work, and I ended up recommending that all tests be adjusted to use patterns such as get_or_create to avoid assuming the data will be persisted.

In the PR you discuss that the approach could be sped up even more by creating a clone per TransactionTestCase to avoid loading fixtures multiple times, and I think it would be worth doing. I guess it could even be done once per hash(fixtures) so the clone is reused between multiple TransactionTestCase classes, but that would make keeping track of the clones and disposing of them harder.
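A rough sketch of what the "once per hash(fixtures)" bookkeeping might look like follows. Everything in it is hypothetical: `CloneCache`, `fake_build_clone`, and the generated database names are illustration only and are not taken from the PR or from Django. The key keeps fixture order, on the assumption that load order can matter.

```python
from typing import Callable, Dict, Iterable, List, Tuple

FixtureKey = Tuple[str, ...]


class CloneCache:
    """Hypothetical registry of fixture-loaded clones, keyed by fixture list."""

    def __init__(self, build_clone: Callable[[FixtureKey], str]) -> None:
        self._build_clone = build_clone
        self._clones: Dict[FixtureKey, str] = {}

    def get(self, fixtures: Iterable[str]) -> str:
        key = tuple(fixtures)  # order is preserved on purpose
        if key not in self._clones:
            # Only the first test class with this fixture list pays the cost.
            self._clones[key] = self._build_clone(key)
        return self._clones[key]

    def dispose_all(self) -> List[str]:
        """Forget every clone and return the names that would need dropping."""
        dropped = list(self._clones.values())
        self._clones.clear()
        return dropped


built: List[FixtureKey] = []


def fake_build_clone(fixtures: FixtureKey) -> str:
    # Stand-in for "clone the template DB and load these fixtures into it".
    built.append(fixtures)
    return f"test_db_fixtures_{len(built)}"


cache = CloneCache(fake_build_clone)
name_a = cache.get(["users.json"])
name_b = cache.get(["users.json"])  # reused, no second build
name_c = cache.get(["users.json", "orders.json"])  # different key, new clone
```

The `dispose_all` method is where the extra tracking cost shows up: something has to own the cache for the whole run and drop every clone at the end.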

Something worth covering in more detail, I believe, is all the test options added over the years to make TransactionTestCase faster that this could make obsolete.

For example, what should be done with reset_sequences? You mention that it doesn’t really apply, but I think it does to a certain extent, as both reset_sequences=True and reset_sequences=False have to continue being supported or the option must be deprecated. For example, some tests out there might rely on reset_sequences=False and expect that no ID for a table is ever reused for the duration of the suite. Cloning breaks that. Not that it’s a blocker in itself, but it’s worth pointing out. As for reset_sequences=True, the same approach you mentioned for fixtures could be used: create a per-test-class clone first, reset the sequences on it, and use it as the clone base for each test method.
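The ID reuse concern can be shown with a small stand-in, again using plain sqlite3 rather than Django. The `item` table is made up, the flush between tests is emulated with a bare DELETE, and cloning is emulated with `Connection.backup`, so treat this as a sketch of the behaviour rather than a reproduction of what TransactionTestCase does on any given backend.

```python
import sqlite3

SCHEMA = "CREATE TABLE item (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"


def insert_item(db: sqlite3.Connection) -> int:
    """Insert one row and return the ID the database assigned to it."""
    cur = db.execute("INSERT INTO item (name) VALUES ('x')")
    db.commit()
    return cur.lastrowid


# Strategy 1: one database for the whole run, rows deleted between "tests".
shared = sqlite3.connect(":memory:")
shared.execute(SCHEMA)
ids_with_flush = []
for _ in range(2):
    ids_with_flush.append(insert_item(shared))
    shared.execute("DELETE FROM item")  # sequence state is left untouched
    shared.commit()

# Strategy 2: every "test" starts from a fresh clone of a pristine template.
template = sqlite3.connect(":memory:")
template.execute(SCHEMA)
template.commit()
ids_with_clone = []
for _ in range(2):
    test_db = sqlite3.connect(":memory:")
    template.backup(test_db)  # the clone inherits the template's sequence state
    ids_with_clone.append(insert_item(test_db))
    test_db.close()
```

With the shared database the two inserts receive IDs 1 and 2, while with per-test clones both receive ID 1, so a suite that relies on IDs never repeating would see different behaviour.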

available_apps was effectively added in the first place to speed up TransactionTestCase to accommodate large Django apps with a lot of models (Anssi did it for the Django internal suite at first), so it’s fair to assume that it might no longer be necessary. It’s documented to do way more than that nowadays though, so we have to ensure that we find a proper deprecation path here and document what exactly will change.

To summarize, the approach seems promising, particularly from a performance perspective (I’d appreciate it if your benchmark were performed against all backends so we have a clearer view), but there are non-trivial deprecation questions that still need to be answered.
