OK. I ran 4 cases (one of them twice). Interestingly, the first time I ran the 1 case where it repeatedly has run consistently slowly, it ran fast. When I repeated the test, it again ran slowly.
Here is the code for the test I ran. (I just put back 2 print statements, compared to what I pasted above.):
@override_settings(DEBUG=True)
def test_labeled_elements_common_with_compound(self):
"""
Test to ensure count 0 entries are not created when measured compound doesn't have that element
"""
self.load_multilabel_data()
call_command(
"load_accucor_msruns",
accucor_file="DataRepo/example_data/obob_fasted_glc_lac_gln_ala_multiple_labels/"
"alafasted_cor.xlsx",
protocol="Default",
date="2021-04-29",
researcher="Xianfeng Zeng",
new_researcher=False,
isocorr_format=True,
skip_cache_updates=True,
)
pg = (
PeakGroup.objects.filter(msrun__sample__name="xzl5_panc")
.filter(name__exact="serine")
.filter(
peak_annotation_file__filename="alafasted_cor.xlsx",
)
.order_by("id", "peak_data__labels__element")
.distinct("id", "peak_data__labels__element")
)
self.assertEqual(pg.count(), 2)
print("TEST CARBON COUNT")
self.assertEqual(pg.filter(peak_data__labels__element__exact="C").count(), 1)
print("TEST NITROGEN COUNT")
self.assertEqual(pg.filter(peak_data__labels__element__exact="N").count(), 1)
I changed my .env
to indicate either DATABASE_PORT=5433
for postgres 13 or DATABASE_PORT=5432
for postgres 10 and I edited the requirements to indicate either Django==4.2.4
or Django==3.2.20
and did python -m pip install -r requirements/dev.txt
to use different Django versions.
And for completeness, I set this in the settings:
LOGGING = {
"version": 1,
"filters": {
"require_debug_true": {
"()": "django.utils.log.RequireDebugTrue",
}
},
"handlers": {
"console": {
"level": "DEBUG",
"filters": ["require_debug_true"],
"class": "logging.StreamHandler",
}
},
"loggers": {
"django.db.backends": {
"level": "DEBUG",
"handlers": ["console"],
}
},
}
Results
Case 1 (FAST) Django 4.2, postgres 13
python manage.py test -k test_labeled_elements_common_with_compound
TEST CARBON COUNT
(0.001) SELECT COUNT(*) FROM (SELECT DISTINCT ON ("DataRepo_peakgroup"."id", "DataRepo_peakdatalabel"."element") "DataRepo_peakgroup"."id" AS "col1", "DataRepo_peakgroup"."name" AS "col2", "DataRepo_peakgroup"."formula" AS "col3", "DataRepo_peakgroup"."msrun_id" AS "col4", "DataRepo_peakgroup"."peak_annotation_file_id" AS "col5" FROM "DataRepo_peakgroup" INNER JOIN "DataRepo_msrun" ON ("DataRepo_peakgroup"."msrun_id" = "DataRepo_msrun"."id") INNER JOIN "DataRepo_sample" ON ("DataRepo_msrun"."sample_id" = "DataRepo_sample"."id") INNER JOIN "DataRepo_archivefile" ON ("DataRepo_peakgroup"."peak_annotation_file_id" = "DataRepo_archivefile"."id") INNER JOIN "DataRepo_peakdata" ON ("DataRepo_peakgroup"."id" = "DataRepo_peakdata"."peak_group_id") INNER JOIN "DataRepo_peakdatalabel" ON ("DataRepo_peakdata"."id" = "DataRepo_peakdatalabel"."peak_data_id") WHERE ("DataRepo_sample"."name" = 'xzl5_panc' AND "DataRepo_peakgroup"."name" = 'serine' AND "DataRepo_archivefile"."filename" = 'alafasted_cor.xlsx' AND "DataRepo_peakdatalabel"."element" = 'C') ORDER BY "DataRepo_peakgroup"."id" ASC, "DataRepo_peakdatalabel"."element" ASC) subquery; args=('xzl5_panc', 'serine', 'alafasted_cor.xlsx', 'C'); alias=default
TEST NITROGEN COUNT
(0.002) SELECT COUNT(*) FROM (SELECT DISTINCT ON ("DataRepo_peakgroup"."id", "DataRepo_peakdatalabel"."element") "DataRepo_peakgroup"."id" AS "col1", "DataRepo_peakgroup"."name" AS "col2", "DataRepo_peakgroup"."formula" AS "col3", "DataRepo_peakgroup"."msrun_id" AS "col4", "DataRepo_peakgroup"."peak_annotation_file_id" AS "col5" FROM "DataRepo_peakgroup" INNER JOIN "DataRepo_msrun" ON ("DataRepo_peakgroup"."msrun_id" = "DataRepo_msrun"."id") INNER JOIN "DataRepo_sample" ON ("DataRepo_msrun"."sample_id" = "DataRepo_sample"."id") INNER JOIN "DataRepo_archivefile" ON ("DataRepo_peakgroup"."peak_annotation_file_id" = "DataRepo_archivefile"."id") INNER JOIN "DataRepo_peakdata" ON ("DataRepo_peakgroup"."id" = "DataRepo_peakdata"."peak_group_id") INNER JOIN "DataRepo_peakdatalabel" ON ("DataRepo_peakdata"."id" = "DataRepo_peakdatalabel"."peak_data_id") WHERE ("DataRepo_sample"."name" = 'xzl5_panc' AND "DataRepo_peakgroup"."name" = 'serine' AND "DataRepo_archivefile"."filename" = 'alafasted_cor.xlsx' AND "DataRepo_peakdatalabel"."element" = 'N') ORDER BY "DataRepo_peakgroup"."id" ASC, "DataRepo_peakdatalabel"."element" ASC) subquery; args=('xzl5_panc', 'serine', 'alafasted_cor.xlsx', 'N'); alias=default
Case 2 (UNEXPECTEDLY FAST) Django 4.2, postgres 13
python manage.py test
TEST CARBON COUNT
(0.020) SELECT COUNT(*) FROM (SELECT DISTINCT ON ("DataRepo_peakgroup"."id", "DataRepo_peakdatalabel"."element") "DataRepo_peakgroup"."id" AS "col1", "DataRepo_peakgroup"."name" AS "col2", "DataRepo_peakgroup"."formula" AS "col3", "DataRepo_peakgroup"."msrun_id" AS "col4", "DataRepo_peakgroup"."peak_annotation_file_id" AS "col5" FROM "DataRepo_peakgroup" INNER JOIN "DataRepo_msrun" ON ("DataRepo_peakgroup"."msrun_id" = "DataRepo_msrun"."id") INNER JOIN "DataRepo_sample" ON ("DataRepo_msrun"."sample_id" = "DataRepo_sample"."id") INNER JOIN "DataRepo_archivefile" ON ("DataRepo_peakgroup"."peak_annotation_file_id" = "DataRepo_archivefile"."id") INNER JOIN "DataRepo_peakdata" ON ("DataRepo_peakgroup"."id" = "DataRepo_peakdata"."peak_group_id") INNER JOIN "DataRepo_peakdatalabel" ON ("DataRepo_peakdata"."id" = "DataRepo_peakdatalabel"."peak_data_id") WHERE ("DataRepo_sample"."name" = 'xzl5_panc' AND "DataRepo_peakgroup"."name" = 'serine' AND "DataRepo_archivefile"."filename" = 'alafasted_cor.xlsx' AND "DataRepo_peakdatalabel"."element" = 'C') ORDER BY "DataRepo_peakgroup"."id" ASC, "DataRepo_peakdatalabel"."element" ASC) subquery; args=('xzl5_panc', 'serine', 'alafasted_cor.xlsx', 'C'); alias=default
TEST NITROGEN COUNT
(0.021) SELECT COUNT(*) FROM (SELECT DISTINCT ON ("DataRepo_peakgroup"."id", "DataRepo_peakdatalabel"."element") "DataRepo_peakgroup"."id" AS "col1", "DataRepo_peakgroup"."name" AS "col2", "DataRepo_peakgroup"."formula" AS "col3", "DataRepo_peakgroup"."msrun_id" AS "col4", "DataRepo_peakgroup"."peak_annotation_file_id" AS "col5" FROM "DataRepo_peakgroup" INNER JOIN "DataRepo_msrun" ON ("DataRepo_peakgroup"."msrun_id" = "DataRepo_msrun"."id") INNER JOIN "DataRepo_sample" ON ("DataRepo_msrun"."sample_id" = "DataRepo_sample"."id") INNER JOIN "DataRepo_archivefile" ON ("DataRepo_peakgroup"."peak_annotation_file_id" = "DataRepo_archivefile"."id") INNER JOIN "DataRepo_peakdata" ON ("DataRepo_peakgroup"."id" = "DataRepo_peakdata"."peak_group_id") INNER JOIN "DataRepo_peakdatalabel" ON ("DataRepo_peakdata"."id" = "DataRepo_peakdatalabel"."peak_data_id") WHERE ("DataRepo_sample"."name" = 'xzl5_panc' AND "DataRepo_peakgroup"."name" = 'serine' AND "DataRepo_archivefile"."filename" = 'alafasted_cor.xlsx' AND "DataRepo_peakdatalabel"."element" = 'N') ORDER BY "DataRepo_peakgroup"."id" ASC, "DataRepo_peakdatalabel"."element" ASC) subquery; args=('xzl5_panc', 'serine', 'alafasted_cor.xlsx', 'N'); alias=default
Repeated Case 2 exactly (SLOW)
TEST CARBON COUNT
(147.730) SELECT COUNT(*) FROM (SELECT DISTINCT ON ("DataRepo_peakgroup"."id", "DataRepo_peakdatalabel"."element") "DataRepo_peakgroup"."id" AS "col1", "DataRepo_peakgroup"."name" AS "col2", "DataRepo_peakgroup"."formula" AS "col3", "DataRepo_peakgroup"."msrun_id" AS "col4", "DataRepo_peakgroup"."peak_annotation_file_id" AS "col5" FROM "DataRepo_peakgroup" INNER JOIN "DataRepo_msrun" ON ("DataRepo_peakgroup"."msrun_id" = "DataRepo_msrun"."id") INNER JOIN "DataRepo_sample" ON ("DataRepo_msrun"."sample_id" = "DataRepo_sample"."id") INNER JOIN "DataRepo_archivefile" ON ("DataRepo_peakgroup"."peak_annotation_file_id" = "DataRepo_archivefile"."id") INNER JOIN "DataRepo_peakdata" ON ("DataRepo_peakgroup"."id" = "DataRepo_peakdata"."peak_group_id") INNER JOIN "DataRepo_peakdatalabel" ON ("DataRepo_peakdata"."id" = "DataRepo_peakdatalabel"."peak_data_id") WHERE ("DataRepo_sample"."name" = 'xzl5_panc' AND "DataRepo_peakgroup"."name" = 'serine' AND "DataRepo_archivefile"."filename" = 'alafasted_cor.xlsx' AND "DataRepo_peakdatalabel"."element" = 'C') ORDER BY "DataRepo_peakgroup"."id" ASC, "DataRepo_peakdatalabel"."element" ASC) subquery; args=('xzl5_panc', 'serine', 'alafasted_cor.xlsx', 'C'); alias=default
TEST NITROGEN COUNT
(145.653) SELECT COUNT(*) FROM (SELECT DISTINCT ON ("DataRepo_peakgroup"."id", "DataRepo_peakdatalabel"."element") "DataRepo_peakgroup"."id" AS "col1", "DataRepo_peakgroup"."name" AS "col2", "DataRepo_peakgroup"."formula" AS "col3", "DataRepo_peakgroup"."msrun_id" AS "col4", "DataRepo_peakgroup"."peak_annotation_file_id" AS "col5" FROM "DataRepo_peakgroup" INNER JOIN "DataRepo_msrun" ON ("DataRepo_peakgroup"."msrun_id" = "DataRepo_msrun"."id") INNER JOIN "DataRepo_sample" ON ("DataRepo_msrun"."sample_id" = "DataRepo_sample"."id") INNER JOIN "DataRepo_archivefile" ON ("DataRepo_peakgroup"."peak_annotation_file_id" = "DataRepo_archivefile"."id") INNER JOIN "DataRepo_peakdata" ON ("DataRepo_peakgroup"."id" = "DataRepo_peakdata"."peak_group_id") INNER JOIN "DataRepo_peakdatalabel" ON ("DataRepo_peakdata"."id" = "DataRepo_peakdatalabel"."peak_data_id") WHERE ("DataRepo_sample"."name" = 'xzl5_panc' AND "DataRepo_peakgroup"."name" = 'serine' AND "DataRepo_archivefile"."filename" = 'alafasted_cor.xlsx' AND "DataRepo_peakdatalabel"."element" = 'N') ORDER BY "DataRepo_peakgroup"."id" ASC, "DataRepo_peakdatalabel"."element" ASC) subquery; args=('xzl5_panc', 'serine', 'alafasted_cor.xlsx', 'N'); alias=default
Case 3 (FAST) Django 3.2, postgres 10
python manage.py test -k test_labeled_elements_common_with_compound
TEST CARBON COUNT
(0.001) SELECT COUNT(*) FROM (SELECT DISTINCT ON ("DataRepo_peakgroup"."id", "DataRepo_peakdatalabel"."element") "DataRepo_peakgroup"."id" AS Col1, "DataRepo_peakgroup"."name" AS Col2, "DataRepo_peakgroup"."formula" AS Col3, "DataRepo_peakgroup"."msrun_id" AS Col4, "DataRepo_peakgroup"."peak_annotation_file_id" AS Col5 FROM "DataRepo_peakgroup" INNER JOIN "DataRepo_msrun" ON ("DataRepo_peakgroup"."msrun_id" = "DataRepo_msrun"."id") INNER JOIN "DataRepo_sample" ON ("DataRepo_msrun"."sample_id" = "DataRepo_sample"."id") INNER JOIN "DataRepo_archivefile" ON ("DataRepo_peakgroup"."peak_annotation_file_id" = "DataRepo_archivefile"."id") INNER JOIN "DataRepo_peakdata" ON ("DataRepo_peakgroup"."id" = "DataRepo_peakdata"."peak_group_id") INNER JOIN "DataRepo_peakdatalabel" ON ("DataRepo_peakdata"."id" = "DataRepo_peakdatalabel"."peak_data_id") WHERE ("DataRepo_sample"."name" = 'xzl5_panc' AND "DataRepo_peakgroup"."name" = 'serine' AND "DataRepo_archivefile"."filename" = 'alafasted_cor.xlsx' AND "DataRepo_peakdatalabel"."element" = 'C') ORDER BY "DataRepo_peakgroup"."id" ASC, "DataRepo_peakdatalabel"."element" ASC) subquery; args=('xzl5_panc', 'serine', 'alafasted_cor.xlsx', 'C')
TEST NITROGEN COUNT
(0.002) SELECT COUNT(*) FROM (SELECT DISTINCT ON ("DataRepo_peakgroup"."id", "DataRepo_peakdatalabel"."element") "DataRepo_peakgroup"."id" AS Col1, "DataRepo_peakgroup"."name" AS Col2, "DataRepo_peakgroup"."formula" AS Col3, "DataRepo_peakgroup"."msrun_id" AS Col4, "DataRepo_peakgroup"."peak_annotation_file_id" AS Col5 FROM "DataRepo_peakgroup" INNER JOIN "DataRepo_msrun" ON ("DataRepo_peakgroup"."msrun_id" = "DataRepo_msrun"."id") INNER JOIN "DataRepo_sample" ON ("DataRepo_msrun"."sample_id" = "DataRepo_sample"."id") INNER JOIN "DataRepo_archivefile" ON ("DataRepo_peakgroup"."peak_annotation_file_id" = "DataRepo_archivefile"."id") INNER JOIN "DataRepo_peakdata" ON ("DataRepo_peakgroup"."id" = "DataRepo_peakdata"."peak_group_id") INNER JOIN "DataRepo_peakdatalabel" ON ("DataRepo_peakdata"."id" = "DataRepo_peakdatalabel"."peak_data_id") WHERE ("DataRepo_sample"."name" = 'xzl5_panc' AND "DataRepo_peakgroup"."name" = 'serine' AND "DataRepo_archivefile"."filename" = 'alafasted_cor.xlsx' AND "DataRepo_peakdatalabel"."element" = 'N') ORDER BY "DataRepo_peakgroup"."id" ASC, "DataRepo_peakdatalabel"."element" ASC) subquery; args=('xzl5_panc', 'serine', 'alafasted_cor.xlsx', 'N')
Case 4 (FAST) Django 3.2, postgres 10
python manage.py test
TEST CARBON COUNT
(0.001) SELECT COUNT(*) FROM (SELECT DISTINCT ON ("DataRepo_peakgroup"."id", "DataRepo_peakdatalabel"."element") "DataRepo_peakgroup"."id" AS Col1, "DataRepo_peakgroup"."name" AS Col2, "DataRepo_peakgroup"."formula" AS Col3, "DataRepo_peakgroup"."msrun_id" AS Col4, "DataRepo_peakgroup"."peak_annotation_file_id" AS Col5 FROM "DataRepo_peakgroup" INNER JOIN "DataRepo_msrun" ON ("DataRepo_peakgroup"."msrun_id" = "DataRepo_msrun"."id") INNER JOIN "DataRepo_sample" ON ("DataRepo_msrun"."sample_id" = "DataRepo_sample"."id") INNER JOIN "DataRepo_archivefile" ON ("DataRepo_peakgroup"."peak_annotation_file_id" = "DataRepo_archivefile"."id") INNER JOIN "DataRepo_peakdata" ON ("DataRepo_peakgroup"."id" = "DataRepo_peakdata"."peak_group_id") INNER JOIN "DataRepo_peakdatalabel" ON ("DataRepo_peakdata"."id" = "DataRepo_peakdatalabel"."peak_data_id") WHERE ("DataRepo_sample"."name" = 'xzl5_panc' AND "DataRepo_peakgroup"."name" = 'serine' AND "DataRepo_archivefile"."filename" = 'alafasted_cor.xlsx' AND "DataRepo_peakdatalabel"."element" = 'C') ORDER BY "DataRepo_peakgroup"."id" ASC, "DataRepo_peakdatalabel"."element" ASC) subquery; args=('xzl5_panc', 'serine', 'alafasted_cor.xlsx', 'C')
TEST NITROGEN COUNT
(0.001) SELECT COUNT(*) FROM (SELECT DISTINCT ON ("DataRepo_peakgroup"."id", "DataRepo_peakdatalabel"."element") "DataRepo_peakgroup"."id" AS Col1, "DataRepo_peakgroup"."name" AS Col2, "DataRepo_peakgroup"."formula" AS Col3, "DataRepo_peakgroup"."msrun_id" AS Col4, "DataRepo_peakgroup"."peak_annotation_file_id" AS Col5 FROM "DataRepo_peakgroup" INNER JOIN "DataRepo_msrun" ON ("DataRepo_peakgroup"."msrun_id" = "DataRepo_msrun"."id") INNER JOIN "DataRepo_sample" ON ("DataRepo_msrun"."sample_id" = "DataRepo_sample"."id") INNER JOIN "DataRepo_archivefile" ON ("DataRepo_peakgroup"."peak_annotation_file_id" = "DataRepo_archivefile"."id") INNER JOIN "DataRepo_peakdata" ON ("DataRepo_peakgroup"."id" = "DataRepo_peakdata"."peak_group_id") INNER JOIN "DataRepo_peakdatalabel" ON ("DataRepo_peakdata"."id" = "DataRepo_peakdatalabel"."peak_data_id") WHERE ("DataRepo_sample"."name" = 'xzl5_panc' AND "DataRepo_peakgroup"."name" = 'serine' AND "DataRepo_archivefile"."filename" = 'alafasted_cor.xlsx' AND "DataRepo_peakdatalabel"."element" = 'N') ORDER BY "DataRepo_peakgroup"."id" ASC, "DataRepo_peakdatalabel"."element" ASC) subquery; args=('xzl5_panc', 'serine', 'alafasted_cor.xlsx', 'N')
So the test slowness (per test) is apparently at least somewhat intermittent, which perhaps is why the running time on github has a wider running time range (60m ±15m). Out of the roughly half dozen times I’ve run all tests locally, this is the only time I’ve seen it run fast under Django 4.2 and postgres 13. Otherwise, that one test runs consistently slowly when running all tests and consistently fast when running an individual test.