Domain name validation - inconsistencies?

I came across a possible inconsistency in the regex for domain name validation in django/core/validators.py - it appears to be different in the URLValidator and the EmailValidator.

In the EmailValidator it looks like the regex doesn’t account for internationalised domain names with non-ASCII characters.

Should this be consistent across both validators? How should we be handling internationalised domain names?

This is related to the current ticket I’m looking at on adding a DomainNameValidator. I think the logic there should be something like:

if accept_idna:
    # Convert to ASCII (punycode) first.
    domain_name = punycode(domain_name)
# Either way, only accept a string that is pure ASCII at this point
# (plus whatever other constraints there are for a domain name).

But would be grateful for views/advice.
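
To make the idea above concrete, here is a minimal standalone sketch. It is not Django's implementation: `is_valid_domain_name` and `ASCII_DOMAIN_RE` are hypothetical names, the regex is a simplified assumption about what an ASCII domain name should look like, and the standard library's `idna` codec stands in for the `punycode()` call.

```python
import re

# Assumed ASCII-only rule (hypothetical, not Django's regex): dot-separated
# labels of 1-63 letters/digits/hyphens with no leading or trailing hyphen,
# followed by an alphabetic TLD or a punycode ("xn--") TLD.
ASCII_DOMAIN_RE = re.compile(
    r"(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+"
    r"(?:[a-z]{2,63}|xn--[a-z0-9-]{0,58}[a-z0-9])",
    re.IGNORECASE,
)


def is_valid_domain_name(domain_name, accept_idna=True):
    """Return True if domain_name passes the simplified check above."""
    if accept_idna:
        try:
            # Convert any non-ASCII labels to their punycode (ASCII) form.
            domain_name = domain_name.encode("idna").decode("ascii")
        except UnicodeError:
            # Raised for empty or over-long labels and unencodable input.
            return False
    elif not domain_name.isascii():
        # IDNs are not accepted, so any non-ASCII character is an error.
        return False
    # From here on the name is pure ASCII in both branches.
    return (
        len(domain_name) <= 253
        and ASCII_DOMAIN_RE.fullmatch(domain_name) is not None
    )


if __name__ == "__main__":
    print(is_valid_domain_name("example.com"))
    print(is_valid_domain_name("münchen.de"))
    print(is_valid_domain_name("münchen.de", accept_idna=False))
```

Both branches end in the same ASCII-only check, so the `accept_idna` flag only decides whether a non-ASCII name is converted first or rejected outright.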

Hi @nmenezes0.

Good question.

Have a look at #27029 (Make EmailValidator accept non-ASCII characters) and the related threads there.

That ticket and Claude’s reply on it cover the state of play, I think.

Summary is that it’s not really feasible to account for everything possible here, so much better to ship a very simple validator, and then allow folks to supply a more sophisticated one if their project needs it.
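
As a rough illustration of that split, here is a sketch using plain Python callables rather than Django's validator classes. `SimpleDomainValidator`, `AllowListedTldValidator`, and `run_validators` are hypothetical names, and the rules they enforce are assumptions chosen only to show a simple default next to a project-supplied stricter one.

```python
import re


class SimpleDomainValidator:
    """Deliberately minimal: ASCII labels separated by dots, nothing more."""

    label_re = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?", re.IGNORECASE)

    def __call__(self, value):
        labels = value.split(".")
        if len(labels) < 2 or not all(
            self.label_re.fullmatch(label) for label in labels
        ):
            raise ValueError(f"Invalid domain name: {value!r}")


class AllowListedTldValidator(SimpleDomainValidator):
    """Project-specific extension: the TLD must also be on an allow list."""

    allowed_tlds = frozenset({"com", "org", "net"})  # assumed project policy

    def __call__(self, value):
        super().__call__(value)  # run the simple check first
        tld = value.rsplit(".", 1)[1].lower()
        if tld not in self.allowed_tlds:
            raise ValueError(f"TLD not allowed: {tld!r}")


def run_validators(value, validators):
    """Apply each validator in turn; the first failure raises ValueError."""
    for validator in validators:
        validator(value)
    return value


if __name__ == "__main__":
    # The default stays simple; a project opts in to the stricter rule.
    run_validators("example.dev", [SimpleDomainValidator()])
    run_validators("example.com", [AllowListedTldValidator()])
```

The shipped validator stays small and predictable, and anything more opinionated lives in the project that needs it.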

The recent comment:36 is new since last time I checked, so if you wanted to dig into that it might be cool.

Kind Regards,

Carlton

Thanks @carltongibson - I'll write a simple validator for the DomainNameValidator ticket, and will dig into the link in that recent comment to see whether it can help with the domain name validation.
