urlize filter mangling encoding in URLs

If I put this in a Django template:

{{ "https://example.com/data=https:%2F%2Ffoo.org%2F"|urlize }}

then it generates this HTML (line breaks and indentation added for readability):

<a href="https://example.com/data=https://foo.org/" rel="nofollow">
  https://example.com/data=https:%2F%2Ffoo.org%2F
</a>

As you can see, it decodes every encoded %2F in the href into a /, while the link text is left unchanged.

I realise you’d usually put this kind of stuff in GET args, but sites like Google Street View have URLs like that, e.g. https://www.google.com/maps/@51.4617759,-0.1456679,3a,75y,327.01h,88.41t/data=!3m7!1e1!3m5!1sCShFitAntdbZQpjV4_T-pA!2e0!6shttps:%2F%2Fstreetviewpixels-pa.googleapis.com%2Fv1%2Fthumbnail%3Fpanoid%3DCShFitAntdbZQpjV4_T-pA%26cb_client%3Dmaps_sv.share%26w%3D900%26h%3D600%26yaw%3D327.01%26pitch%3D1.5900000000000034%26thumbfov%3D90!7i16384!8i8192?coh=205410&entry=ttu

And using the urlize filter on that mangles the URL.

Is this a bug or a security feature? And is there a safe way around it for rendering links to user-submitted URLs?

Are you sure you wrote the url correctly?

Where is the question mark??

https://example.com/data=https:%2F%2Ffoo.org%2F
=>
https://example.com/?data=https:%2F%2Ffoo.org%2F

The question mark is about 25 characters before the end. The only query data provided in the request is “coh=205410&entry=ttu”.
The rest of that text is all part of the url.

As @KenWhitesell said, yes, that URL is correct. Weird and annoying, but correct.

I think this is an intentional conversion.
If you don’t want this, you have to write your own template filter or build the <a> tag in some other way.
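
As a rough idea of what such a replacement could look like, here is a minimal sketch in plain Python. The function name link_preserving_encoding and the ALLOWED_SCHEMES set are made up for this example and are not part of Django. It HTML-escapes the URL but never percent-decodes it, and it only links http and https URLs, which is an assumption about what you would want for user-submitted input.

```python
from html import escape
from urllib.parse import urlsplit

# Hypothetical allow-list: only these schemes are turned into links.
ALLOWED_SCHEMES = {"http", "https"}


def link_preserving_encoding(url):
    """Build an <a> tag for url without touching its percent-escapes.

    The URL is HTML-escaped so it cannot break out of the attribute,
    but it is never passed through unquote(), so %2F stays %2F.
    Anything without an allowed scheme comes back as escaped plain text.
    """
    text = escape(url, quote=True)
    try:
        scheme = urlsplit(url).scheme.lower()
    except ValueError:
        # urlsplit rejects some malformed URLs; do not link those.
        return text
    if scheme not in ALLOWED_SCHEMES:
        return text
    return f'<a href="{text}" rel="nofollow">{text}</a>'


print(link_preserving_encoding("https://example.com/data=https:%2F%2Ffoo.org%2F"))
```

To use something like this from a template you would presumably register it as a custom filter with @register.filter and wrap the return value in mark_safe, since the function has already done the escaping itself. Unlike urlize, this sketch expects the whole value to be a single URL and does not search for URLs inside longer text.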

urlize uses Urlizer, and Urlizer uses smart_urlquote, which calls unquote on the URL before quoting it again.
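
To show why the %2F does not survive, here is a minimal sketch using only Python's urllib.parse. It assumes smart_urlquote unquotes each part of the URL and then re-quotes it with the RFC 3986 reserved characters treated as safe. The helper name unquote_then_quote and the two constants are made up for this example and are not Django's API.

```python
from urllib.parse import quote, unquote

# RFC 3986 reserved characters, assumed to be left unescaped when re-quoting.
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="


def unquote_then_quote(segment):
    """Decode all percent-escapes, then re-encode only the unsafe characters."""
    decoded = unquote(segment)  # %2F becomes /
    # "/" is in the safe set, so it is not turned back into %2F.
    return quote(decoded, safe=SUB_DELIMS + GEN_DELIMS + "~")


print(unquote_then_quote("/data=https:%2F%2Ffoo.org%2F"))
# prints /data=https://foo.org/
```

Characters outside the safe set do make the round trip: a %20 is decoded to a space and then encoded back to %20. It is only escapes of reserved characters such as / that are lost.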