Make the system check look at the types of settings?

I was working on #35141 (CACHE_MIDDLEWARE_SECONDS can be set to a float but has to be an int) – Django .

There a misconfigured CACHE_MIDDLEWARE_SECONDS setting leads to a wrong ‘Cache-Control’: ‘max-age=2.0’ header.

At first I thought it would be a good idea to just cast this to an int (like in this ticket #31982 (Convert max_age to an int in set_cookie()) – Django)

But then in another ticket #35041 (DATA_UPLOAD_MAX_MEMORY_SIZE causes a confusing error when not an integer) – Django this was rejected:

We have dozens of settings and we cannot add type checks for all of them to the Settings. Especially when an expected type is clearly documented.

So I head the idea to use the system check to look for sane setting types at the start. In the example above the output would be something like this:

System check identified some issues:

WARNINGS:
?: (files.W001) The type of CACHE_MIDDLEWARE_SECONDS should be int (default 600). see https://docs.djangoproject.com/en/dev/ref/settings/#std-setting-CACHE_MIDDLEWARE_SECONDS

For some misconfigured settings there will still be an explicit ImproperlyConfigured exception raised (like The ALLOWED_HOSTS setting must be a list or a tuple) or an implicit exception like AttributeError: ‘int’ object has no attribute ‘find’ when for example set LANGUAGE_CODE to an int.

The implementation checks the type of the default value against the type of the user setting and issues a warning if they don’t match. This works for most settings, since they do have a default value other than None. For the ones that have None as the default value, a table is present, that contains the expected default values.

Some poc code is here: Check settings types during systems checks · e11bits/django@c05225c · GitHub

Is this something people would find useful? Should I go forward with this and create a PR?

Side note: Your post was made on a Friday afternoon for many of us in the US. This would be Friday evening for those in London and Saturday morning for the Australians. Not everyone who follows this board does so on weekends, especially if they’re the type of people who want to seriously read what you’ve written along with the references you’ve posted. I suggest you give it at least 3 more days before drawing any conclusions about responses (or lack thereof).

Hi @e11bits

So system checks is a tricky one. Folks do complain if they start running slowly, so we do need to be careful about adding too many.

There was a thread on mastodon the other day about controlling when system checks are run. (There’s --skip-checks and check tags, but folks seem to want more, or better, ways of controlling them.)

There are also ideas about having declarative settings objects, which are type and/or validate values.

(I don’t have links to hand sorry.)

Pushing on either of those gets past/works around/alieviates the concern about too many checks.

That would be my suggestion to move forward.

Hi @carltongibson

Honestly, I feel a little bit like running around in circles with this.

I started with working on #27225 (and this is actually still my main objective to get it fixed).

Then I saw that Django sets the max-age directive of the Cache-Control header to a float, if for example CACHE_MIDDLEWARE_SECONDS is a float. There are also two one testcase, where this timeout is incorrectly set to a float. As far as I understand the section in RFC9111 the “number of seconds” has to be an int and not a float for max-age. And django.utils.cache.get_max_age(respone) also expects this to be an int and returns None otherwise.

The implementation to fix #27225 and calculate the Age depends on that the cache is setting the max-age for a cached response.

Changing get_max_age to also parse floats seems wrong, because it is actually correct to expect ints only.

Casting CACHE_MIDDLEWARE_SECONDS to an int wherever used (3-4 locations) seemed to be the right solution to me, since I found #31982 where this was fixed in same manner.

Then I was told that if we start casting this setting to int we probably have to check other settings as well and that this was not desirable.

In the meantime #35141 was accepted, but basically reduced to only make the documentation more clear about CACHE_MIDDLEWARE_SECONDS has to be an int.

I still wanted to find a way to make it obvious, when CACHE_MIDDLEWARE_SECONDS is misconfigured.

If you set it to a float, no Exception will be raised and besides that the directive max-age is wrongly set to a float in responses, everything works and the misconfigured setting is hard to spot.

My last attempt was to have some system check that would output a warning about the misconfigured setting. Just doing this for one setting seemed too limited, so I extended it to most settings.

Now I understand that this is not desirable for different other reasons.

At this point I’m just ready to leave it as it is, because I can’t find a solution that pleases everybody.

I’m all for more system checks. I agree with Carlton that we don’t want to burden the framework with startup overhead (and memory usage for the check functions).

But this check should be fast, because it boils down to some isinstance() calls, which are very fast. And it should be small - the PoC is not much code.

Yes, type checkers would be the natural tool to cover this issue, but most projects don’t use one, at least for the medium term. Protecting users from configuration issues is worth the duplicated effort here IMO.

(I can’t recall if django-stubs checks the types of defined settings at the moment.)

2 Likes

Ok, so I want to pick this up again.

The PoC right now uses one warning id for every setting. Maybe it would be nice to have a unique warning id for each setting?

Something like

settings.W001 ABSOLUTE_URL_OVERRIDES
settings.W002 ADMINS

or maybe not alphabetically, but in groups?

settings.W101 DATE_INPUT_FORMATS
settings.W102 TIME_INPUT_FORMATS
settings.W103 DATETIME_INPUT_FORMATS
settings.W201 USE_THOUSAND_SEPARATOR
settings.W202 NUMBER_GROUPING
settings.W203 THOUSAND_SEPARATOR

With this you could have a finer control what to silence and what not.

SILENCED_SYSTEM_CHECKS = ["settings.W101", "settings.W102"]

But it would be cumbersome to eg. silence all settings warnings. With a small change of the is_silenced method, this would also be possible:

@@ -53,5 +53,7 @@
     def is_silenced(self):
         from django.conf import settings

-        return self.id in settings.SILENCED_SYSTEM_CHECKS
+        return self.id and any(
+            self.id.startswith(prefix) for prefix in settings.SILENCED_SYSTEM_CHECKS
+        )

To silence all settings warnings:

SILENCED_SYSTEM_CHECKS = ["settings"]

Or maybe just the date/time input formats warnings:

SILENCED_SYSTEM_CHECKS = ["settings.W1"]

And I think it could be useful for other checks as well and it would not break any existing SILENCED_SYSTEM_CHECKS settings.

But maybe having a unique warning id for each setting is going over board and this is all not needed?

Maybe just a prefix is not obvious enough and some glob pattern style is better:

SILENCED_SYSTEM_CHECKS = ["settings.*"]
SILENCED_SYSTEM_CHECKS = ["settings.W1??"]
@@ -51,9 +51,15 @@ class CheckMessage:
         return self.level >= level
 
     def is_silenced(self):
+        if not self.id:
+            return False
+
+        from pathlib import PurePath
+
         from django.conf import settings
 
-        return self.id in settings.SILENCED_SYSTEM_CHECKS
+        id_as_path = PurePath(self.id)
+        return any(id_as_path.match(glob) for glob in settings.SILENCED_SYSTEM_CHECKS)
 
 
 class Debug(CheckMessage):

Ok, I abandoned the last idea. Seems not worth the effort and change.

Let’s use a single warning for all settings for now. There’s an open ticket to improve the granularity of silencing system checks, hopefully that would allow for per-setting silencing.

FYI I just realized there’s already a number of existing system checks for types in the admin, see django/django/contrib/admin/checks.py at 7ba6c9edc50dc989fc5c306b541636249b952f93 · django/django · GitHub

1 Like

I saw there is an overlap, but I thought it would be still a benefit, to have a check for all Django settings.

For what’s it worth: I opened a ticket #35231 (Add system checks for settings types) – Django and a draft PR Fixed #35231 -- Adds system checks for settings types updated. by e11bits · Pull Request #17875 · django/django · GitHub

I did some benchmarking with the settings as they are after startproject:

Without settings system checks:

Benchmark 1: ./manage.py check
Time (mean ± σ): 524.6 ms ± 48.4 ms [User: 441.2 ms, System: 82.7 ms]
Range (min … max): 467.0 ms … 781.2 ms 100 runs

With settings system checks:

Benchmark 1: ./manage.py check
Time (mean ± σ): 535.2 ms ± 46.0 ms [User: 449.3 ms, System: 85.4 ms]
Range (min … max): 471.5 ms … 740.9 ms 100 runs

So on average the startup time on my (slow) system takes about 10ms longer (or ~2%) with the checks.

That doesn’t seem too burdensome.

Maybe not, we all know how this ends, folks who make some tricky things, where something behave like something else, but does not pass isinstance() check will report a regression.

1 Like

isinstance() check will also force evaluation for lazy settings which may be unexpected, e.g. hypothetical

def get_time():
    return int(time.time())

CACHE_MIDDLEWARE_SECONDS = SimpleLazyObject(get_time)

>> CACHE_MIDDLEWARE_SECONDS
<SimpleLazyObject: <function get_time at 0x7fce55633d90>>
>> isinstance(CACHE_MIDDLEWARE_SECONDS, int)
True
>> CACHE_MIDDLEWARE_SECONDS
<SimpleLazyObject: 1708333848>
>> CACHE_MIDDLEWARE_SECONDS
<SimpleLazyObject: 1708333848>

Yes. And that’s basically certain to happen. (We broke code along this exact pattern with an ALLOWED_HOSTS change a wee while back.)

Aside: I feel like we’ve been round this block (or one exactly like it) several times over the years. The suspicion I have is that the only real road open is along the typed declarative settings objects type route (that would have other benefits than just catching type errors). Whether we can actually move on a systems check approach I don’t know. Yes, you can silence them… but even with that we get non-negligible numbers of complaints when we do (it feels like) anything here. (Where there’s a clear benefit we can just say “update your code”, but whether that applies here.).

1 Like

We were both down this deep rabbit hole … :hole: and it’s a scary place :see_no_evil:

I don’t want to be a nuisance, but maybe I just need to see for myself and learn from there?
It’s ok if the PR will be rejected because of some experiences in the past, but at least I’m trying… :laughing:

Yes, so I extended the code to cover something like that:

+    def test_setting_lazy(self):
+        weeks = lazy(lambda w: 60 * 60 * 24 * 7 * w, int)
+        self._assertCheckPass(SESSION_COOKIE_AGE=weeks(2))

In this case the check will not check SESSION_COOKIE_AGE. So you’re saying this would not be enough? Would you have some examples for me? Or a pointer to a thread?

    def test_setting_lazy_object(self):
        def get_time():
            return int(time.time())

        self._assertCheckPass(CACHE_MIDDLEWARE_SECONDS=SimpleLazyObject(get_time))

Will pass now as well, (because it will not be checked).

I think the key issue being presented by Mariusz is that even if we ensure that CACHE_MIDDLEWARE_SECONDS works well with lazy objects, we still could have the same “false positive check” issue with other settings…
Unless I’m missing something, I don’t see with clarity what our strategy would be to make the checks more restrictive and yet not “breaking” projects out there making “perhaps unorthodox” usage of settings.

While reading these posts (thank you for your effort and dedication so far!), I keep wondering two things:

  1. Whether we should (instead) consider a more radical approach and re-design settings. For example going the route that Carlton proposes with declarative settings.
  2. Whether there are other framework/tools out there that provide this functionality (typed configs/settings) so we don’t re-invent the wheel. In the past I did work with a “typed config parser”[0] that is currently unmaintained, and I see at least one other library providing this concept[1], though is quite unclear we would like to go down this path.

Now, in my personal opinion, the current settings, while not perfect, are reasonably good, and it’s unclear to me whether efforts in this area are worth the time investment, particularly given the likely pushback we’d get from users doing custom things around them.

[0] configglue | Read the Docs
[1] GitHub - ajatkj/typed_configparser: A fully typed configparser built on top of configparser

1 Like

Yes. The whole settings objects thing only makes sense in relation to other discussions about grouping (and encapsulating) related sets of settings. Email. The various security headers, and so on.