Extract() isn't safe to use in db_default on SQLite if TIME_ZONE changes at some point

If:

  • a field’s database default uses Extract()
  • UZE_TZ = True
  • TIME_ZONE is something other than UTC

… then the timezone baked into the database default will differ from the timezone used by Django to generate values.

Further, on SQLite–because DEFAULT keywords are not supported on INSERT–this can manifest interesting problems where the value generated for a field depends on whether or not any other fields have been set. See this fiddle:

class Person(models.Model):
    other = models.CharField(db_default="other")
    hour = models.IntegerField(db_default=ExtractHour(Now()))
    with override_settings(TIME_ZONE="America/Chicago"):
        instance = Person.objects.create()
        print(f'Created: {instance}')
        instance2 = Person.objects.create(other="blah")
        print(f'Created: {instance2}')
Created: Hour: 20
Created: Hour: 1

The two insert statements are:

INSERT INTO "app_person" ("hour")
VALUES (django_datetime_extract('hour', STRFTIME('%Y-%m-%d %H:%M:%f', 'NOW'), 'America/Chicago', 'UTC'))
INSERT INTO "app_person" ("other")
VALUES ('blah')

…notice how only America/Chicago sneaks into the first INSERT. Here is the code emulating the database default.


We cannot really expect the SQLite backend to inspect the database to get the timezone baked into the default. Should the autodetector detect this situation and make a migration? (That wouldn’t catch uses in tests that manipulate settings, which is where I found this.[1])


  1. At the time, I thought there was a relevant difference between create() and bulk_create() in this regard, but I’m having trouble producing a difference on DryORM. I might retry later with a local test project. ↩︎

2 Likes

It’s really a shame that SQLite doesn’t support the DEFAULT keyword because while INSERT ... DEFAULT VALUES would work for a simple row it won’t work for bulk_create mixing fields using DEFAULT.

I see a few ways to resolve this problem that don’t involve the auto-detector (which I think we should avoid doing).

The first one would be (finally) introduce a sql.SQLCompiler DDL compiler that Extract.as_sqlite could specialize over by using the timezone of the project (not the currently overridden one). That would make as_sqlite use the same timezone as the one used when the database level default is created when the project is setup. We could also specialize by introspecting the database default, as you’ve mentioned, but that’s potentially costly and hard to cache adequately.

The second would be to enable supports_default_keyword_in_insert on SQLite and emulate it in its insert compiler. By partitioning inserts by each combination of tuples it should be possible to avoid using DEFAULT in bulk_create and rely on DEFAULT VALUES when no columns are specified.

1 Like

Interesting! For the first option, how would Django recover the original time zone used? I was assuming that information was simply lost on any settings change without introspecting the db.

1 Like

Something like the following could do

diff --git a/django/db/models/functions/datetime.py b/django/db/models/functions/datetime.py
index 842987bf26..e5a4664a76 100644
--- a/django/db/models/functions/datetime.py
+++ b/django/db/models/functions/datetime.py
@@ -22,7 +22,7 @@ from django.utils import timezone
 class TimezoneMixin:
     tzinfo = None

-    def get_tzname(self):
+    def get_tzname(self, compiler):
         # Timezone conversions must happen to the input datetime *before*
         # applying a function. 2015-12-31 23:00:00 -02:00 is stored in the
         # database as 2016-01-01 01:00:00 +00:00. Any results should be
@@ -30,7 +30,10 @@ class TimezoneMixin:
         tzname = None
         if settings.USE_TZ:
             if self.tzinfo is None:
-                tzname = timezone.get_current_timezone_name()
+                if ddl_compiler_heuristict(compiler):
+                    tzname = timezone.get_current_timezone_name()
+                else:
+                    tzname = timezone.get_default_timezone_name()
             else:
                 tzname = timezone._get_timezone_name(self.tzinfo)

The column’s database DEFAULT is created with the project default TZ so we must ensure that it’s honoured even if the expression is compiled at a time when it’s overridden.

1 Like

I follow how that will help the case where TIME_ZONE is overridden in tests, but I’m also concerned with the case where TIME_ZONE is changed during the life of a project. We could say this should be caught when testing your upgrade, but I’m thinking this situation would be difficult to detect, given how the problem manifests in edge cases depending on the ordering of the fields, how many values are being created, etc.

(Maybe this won’t come up in practice, if we suspect most changed values of TIME_ZONE during the life of a project involve switching to UTC rather than away from it.)

1 Like

If we also want to cover the case of changing TIME_ZONE during the lifetime of the project we could adjust Extract.deconstruct to prevent not providing a tzinfo (through a deprecation period) and then adjust it to return tzinfo=get_current_timezone() when not provided.

This would ensure that the timezone is always captured in the migrations file and if it ever change in the future the adequate operations would be generated.

I see it as a distinct problem than the db_default one though.

2 Likes

Ah, perfect. That same deprecation idea came up elsewhere, so I’ll open a ticket for it.

I’ll open an issue for the db_default issue. I wonder if the second approach:

The second would be to enable supports_default_keyword_in_insert on SQLite and emulate it in its insert compiler

… would be both simpler to implement and provide a performance boost?

1 Like

Tickets opened:

1 Like