Why is the use of `url_has_allowed_host_and_scheme` discouraged

tbrlpld · October 1, 2024, 10:17pm

Hello Djangonauts.

First time poster here.

I have been looking for a way to check that a redirect URL is “safe” (e.g. local to the current site). I hand-rolled a function that strips the domain and keeps only the path. Then I thought: this must be a common problem, so there must be something in Django for it.
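For illustration, a hand-rolled helper of the kind described above might look roughly like this. It is a sketch only, not a vetted security control: the name `local_path_only` and the fallback `default="/"` are hypothetical, and it relies solely on Python's standard `urllib.parse`.

```python
from urllib.parse import urlsplit, urlunsplit


def local_path_only(url, default="/"):
    """Hypothetical helper: drop scheme and host, keep path, query and fragment."""
    if not url:
        return default
    try:
        # Treat backslashes as slashes first, since browsers generally do the same.
        parts = urlsplit(url.strip().replace("\\", "/"))
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. an unclosed "[" in the host.
        return default
    # Collapse any run of leading slashes so the result can never start with "//".
    path = "/" + parts.path.lstrip("/")
    return urlunsplit(("", "", path, parts.query, parts.fragment))


print(local_path_only("https://evil.example/account?tab=1"))
print(local_path_only("//evil.example/x"))
```

Stripping the host this way sidesteps the host check entirely, but it also silently rewrites the URL the user supplied, which may or may not be what a given view wants.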

Turns out there was is_safe_url, which was renamed to url_has_allowed_host_and_scheme. However, both are undocumented and only mentioned in the release notes. And when they are mentioned it does not sound like you are supposed to be using them.

Why is that?

Then there is this quote from the 4.2 release notes:

This is to protect projects that may be incorrectly using the internal url_has_allowed_host_and_scheme() function, instead of using one of the documented functions for handling URL redirects.

Source: Django 4.2 release notes

The most interesting part is “instead of using one of the documented functions for handling URL redirects”. I tried but failed to find the documented functions for handling URL redirects. Does anyone have any pointers to what those might be?

carltongibson · October 2, 2024, 8:12am

url_has_allowed_host_and_scheme is an internal function because it only does part of what’s needed to safely use a provided URL in a redirect. There are escaping issues and other considerations, all of which need to be handled and all of which are easy to miss.

Before the change you link to, the function was called is_safe_url, which gave folks entirely the wrong impression: they’d use it, despite the warnings not to, and think they’d done enough.

The release note is only there as a courtesy to those folks.

Again this is an internal function. Don’t use it.

If you want to do redirects you should always use the documented HttpResponseRedirect which correctly handles the various issues here for you.

tbrlpld · October 2, 2024, 2:28pm

Thanks for your response @carltongibson.

I am not sure I understand how HttpResponseRedirect will make sure that the redirect is safe. Wouldn’t it happily redirect to an unsafe site if one used something like HttpResponseRedirect(request.GET["next"]) (ignoring that the key could be missing, of course)?

tbrlpld · October 2, 2024, 2:50pm

I agree, the rename to url_has_allowed_host_and_scheme makes it much clearer what the function does. Whether a URL that passes those checks can be considered safe depends on the context.

I understand that the function is meant for internal use and not part of the documented public API of Django.

What I don’t quite understand is: why?

It seems like a perfectly valid utility function that does what it says.

It is used exactly how you would expect in the django.contrib.auth.views module to check the “next” parameter.

Source: github.com, django/django/blob/6765b6adf924c1bc8792a4a454d5a788c1abc98e/django/contrib/auth/views.py#L43-L53

          def get_redirect_url(self):
              """Return the user-originating redirect URL if it's safe."""
              redirect_to = self.request.POST.get(
                  self.redirect_field_name, self.request.GET.get(self.redirect_field_name)
              )
              url_is_safe = url_has_allowed_host_and_scheme(
                  url=redirect_to,
                  allowed_hosts=self.get_success_url_allowed_hosts(),
                  require_https=self.request.is_secure(),
              )
              return redirect_to if url_is_safe else ""

What I don’t understand is what would make using the function elsewhere in the exact same manner (to check a URL coming from a query parameter) so worrisome?

EDIT: I guess my question is: Why is it considered “private and internal”?

carltongibson · October 2, 2024, 3:43pm

The code you quote there goes on to use the redirect URL with an HttpResponseRedirect. The latter then performs the required escaping before using the URL. It’s only in that combination that a URL is safe.

Django doesn’t provide equivalent utilities as part of the public API because providing them in a generic way that doesn’t lead folks into security problems has not been considered feasible.

tbrlpld · October 3, 2024, 7:06pm

Thanks for expanding on this @carltongibson.

The URL escaping you mention, is that the typical % escaping of potential query parameters in a URL that has already been checked for allowed host and scheme? We would not want to escape the whole URL, right, because then it wouldn’t work as a URL?

Similar to what’s mentioned on this cheat sheet?
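To make the question above concrete, here is a small sketch using only Python's standard `urllib.parse.quote`. It contrasts escaping every reserved character with escaping only characters that cannot appear literally in a URL. The `RESERVED` set and the example URL are assumptions chosen for illustration; this is not a statement of what `HttpResponseRedirect` does internally.

```python
from urllib.parse import quote

url = "https://example.com/search?q=café au lait&page=2"

# Escaping everything also encodes ":", "/", "?", "=" and "&",
# so the result is no longer usable as a URL.
fully_escaped = quote(url, safe="")

# Leaving reserved delimiters (and "%", so existing escapes survive) alone
# only encodes the spaces and the non-ASCII character.
RESERVED = ":/?#[]@!$&'()*+,;=%"
partly_escaped = quote(url, safe=RESERVED)

print(fully_escaped)
print(partly_escaped)
```

The second form keeps the URL's structure intact while still making it ASCII-only, which is the distinction the question is drawing.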

carltongibson · October 3, 2024, 7:50pm

Ah, OK. You want the full details. That will require me to go over the history. I’m happy to do that on the back burner, for my own refresh, but it won’t be instant I’m afraid.

tbrlpld · December 12, 2024, 12:12am

I would definitely be happy about anything that helps me understand this fully.

Sorry for the late response. I did not get a notification for some reason.

carltongibson · December 12, 2024, 6:56am

OK, no problem. I’ll see if I can write something up. (No deadline!)

tl;dr will be ≈"folks still need to correctly escape the URL, and that’s non-obvious"

tbrlpld · December 12, 2024, 1:47pm

Thanks @carltongibson. That would be much appreciated.
