I have been looking for a way to check that a redirect URL is “safe” (e.g. local to the current site). I have hand rolled a function to only leave the path without the domain. Then I thought this must be a common problem, there must be something in Django.
Turns out there was is_safe_url, which was renamed to url_has_allowed_host_and_scheme. However, both are undocumented and only mentioned in the release notes. And when they are mentioned it does not sound like you are supposed to be using them.
Why is that?
Then there is this quote from the 4.2 release notes:
This is to protect projects that may be incorrectly using the internal url_has_allowed_host_and_scheme() function, instead of using one of the documented functions for handling URL redirects.
The most interesting part is “instead of using one of the documented functions for handling URL redirects”. I tried but failed to find the documented functions for handling URL redirects. Does anyone have any pointers what those might be?
url_has_allowed_host_and_scheme is an internal function because it only does part of what’s needed to safely use a provided URL in a redirect. There’s escaping issues and other considerations, all of which need to be applied, and for which it’s easy to miss.
Before the change you link the function was called is_safe_url — which gave folks entirely the wrong impression: they’d use that — despite the warnings that say don’t — and think they’d done enough.
The release note is only there as a courtesy to those folks.
Again this is an internal function. Don’t use it.
If you want to do redirects you should always use the documented HttpResponseRedirect which correctly handles the various issues here for you.
I am not sure I understand how HttpResponseRedirect will make sure that the redirect is safe. Wouldn’t it happily redirect to an unsafe site if one used something like HttpReponseRedirect(url=request.GET["next"]) (ignoring that the key could be missing of course)?
I agree, the rename to url_has_allowed_host_and_scheme makes it much clearer of what the function does. If a URL that passes those checks can be considered safe would depend on the given context.
I understand that the function is meant for internal use and not part of the documented public API of Django.
What I don’t quite understand is: why?
It seems like a perfectly valid utility function that does what it says.
It is used exactly how you would expect in the django.auth.views module to check the “next” parameter.
What I don’t understand is what would make using the function elsewhere in the exact same manner (to check a URL coming from a query parameter) so worrisome?
EDIT: I guess my question is: Why is it considered “private and internal”?
The code you quote there goes on to use the redirect URL with an HttpResponseRedirect. The latter then performs required escaping before using the URL. It’s only in such combination that a URL is safe.
Django doesn’t provide equivalent utilities as part of the public API because doing so in a generic way but such that folks won’t step into security problems has not been considered feasible.
The URL escaping you mention, is that the typical % escaping of potential query parameters in the URL which was checked to be of allowed host and scheme? We would not want to escape the whole URL right, because then it wouldn’t work as a URL?
Ah, OK. You want the full details. That will require me to go over the history. I’m happy to do that on the back burner, for my own refresh, but it won’t be instant I’m afraid.
The bottom line is this API is tricky to use correctly, safely. If Django documents it as public API, folks will use it incorrectly, expose themselves to security issues, and blame Django — no matter what warnings are in place. That doesn’t correspond to Django’s safe by default goals. (And I don’t really see a way to square that circle.)
If you think you understand all the issues, and can trust yourself and your team to use these methods safely and without mistake, go ahead of course. But caveat emptor — the risk there is on you.
I don’t think that’s true. url_has_allowed_host_and_scheme definitely has a place here. Accepting a query param of a URL and redirecting to it without validation is a security vulnerability: Open redirect vulnerability | Tutorials & examples | Snyk Learn.
HttpResponseRedirect (and the redirect shortcut which returns it) does harden the redirect process (more so than building the Location header yourself), specifically on the response side. When it comes to validating where to redirect the user, that’s where url_has_allowed_host_and_scheme (and others) comes in. Any input coming from the user must be validated. Checking input is a URL is (relatively) easy, or at least Django already has a public API to do it. But checking that you’re about to redirect the user somewhere they’re allowed to is still a necessary part of the process. In many cases, you may only want to accept relative redirect URLs, which is fine and not too hard to implement.
Personally, both with and without my Security team hat on, I could definitely see the value in url_has_allowed_host_and_scheme being part of the public API, or even being run as part of the redirect responses (probably not though, it’d probably be very breaking). Or, if it’s kept internal, a unified API which handles all the necessary validation (there’s not that much, but missing one makes the rest useless) and does a redirect.
I completely agree with @carltongibson that it’s not the full story when it comes to safely handling URLs, but seeing as it’s a critical piece, and one Django already does internally as standard in say the admin, it makes sense to me to expose that functionality publicly. Once it is, we could (and probably should) also do more to document how to redirect securely (validate host, scheme changes, iri_to_uri etc), rather than leaving it up to the user to work out (as @tbrlpld has had to).
I wouldn’t handle a next parameter to an unknown URL. I’m going to know the allowed targets ahead of time, and the “is this allowed” logic is much less generic than Django’s internal handling needs to be. (TBH I don’t see why folks want something fully generic here, which is always going to be hard to audit: just write the function you need.)
I then use redirect response to ensure the proper escaping.