Currently, Django’s password validators accept weak passwords like aaaaaaaaaaaa ('a' * 12) without issue. I propose adding a built-in password validator that detects repeated patterns, which would significantly improve password complexity. Implementing this is straightforward:
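import re
# Lazily capture the shortest leading chunk that is immediately repeated at least once.
repeat_matcher = re.compile(r'(.+?)\1+')
password = 'alaalaala'  # example input
# match() only anchors at the start of the string, so repeats later in the password are not seen.
match = repeat_matcher.match(password)
# Number of repetitions after the first occurrence of the captured chunk.
repeat_cnt = len(match.group(0)) // len(match.group(1)) - 1 if match else 0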
repeat_cnt for alaalaala should be 2.
Additionally, Django could check for common weak patterns, such as consecutive letters or predictable character sequences.
While third-party password validation packages exist, strengthening Django’s defaults would benefit developers who may not research password security thoroughly. The proposed validators are simple to implement and would meaningfully enhance security with minimal added complexity.
>>> password = 'ozozozdaiS3vohQuoohp9kaexo1zeirieC4soZ'
>>> match = repeat_matcher.match(password)
>>> repeat_cnt = len(match.group(0)) // len(match.group(1)) - 1 if match else 0
>>> repeat_cnt
2
This also yields a repeat_cnt of 2 but is a rather secure password.
While I agree that it would be nice to detect more classes of insecure passwords, the implementation probably needs a bit more work than what you just offered. And regarding regexes: we need to ensure that the pattern is not vulnerable to catastrophic backtracking.
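One way to sidestep the backtracking question is to do the same leading-repeat check with plain string comparisons instead of a backreference. The following is only a minimal sketch under that assumption, not a proposed Django API: `leading_repeat_count` is a hypothetical helper that mirrors the `repeat_cnt` calculation above.

```python
def leading_repeat_count(password):
    """Count how often the shortest repeated leading chunk recurs after its first occurrence."""
    n = len(password)
    # Try chunk lengths from shortest to longest, like the lazy (.+?) group does.
    for unit_len in range(1, n // 2 + 1):
        unit = password[:unit_len]
        repeats = 1
        pos = unit_len
        # Consume as many back-to-back copies of the chunk as possible.
        while password.startswith(unit, pos):
            repeats += 1
            pos += unit_len
        if repeats > 1:
            return repeats - 1
    return 0


print(leading_repeat_count("alaalaala"))
print(leading_repeat_count("ozozozdaiS3vohQuoohp9kaexo1zeirieC4soZ"))
```

Both calls report 2, so this reproduces the false positive shown above. It only removes the regex from the picture; the work is bounded by roughly the square of the password length.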
I think a small adjustment that keeps it straightforward could work: have it take the length of the password into account. For example, two repeating characters in a password of length 20 is probably fine, while 5 repeating characters in a password of length 8 is probably not so great.
But it’s still very naive. It doesn’t help against any of the following:
azazazazazaz
aazaazaazaaz
abcabcabcabc
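A rough sketch of that length-aware idea, and of why the three passwords above slip through it, might look like this. The names `longest_run`, `run_ratio`, and the `MAX_RUN_RATIO` threshold are all hypothetical and chosen only for illustration.

```python
from itertools import groupby

MAX_RUN_RATIO = 0.5  # hypothetical threshold, not a recommendation


def longest_run(password):
    """Length of the longest run of one identical character."""
    return max((sum(1 for _ in group) for _, group in groupby(password)), default=0)


def run_ratio(password):
    """Longest identical-character run as a fraction of the password length."""
    return longest_run(password) / len(password) if password else 0.0


def is_too_repetitive(password):
    return run_ratio(password) > MAX_RUN_RATIO


for candidate in ("aaaaabcd", "azazazazazaz", "aazaazaazaaz", "abcabcabcabc"):
    print(candidate, longest_run(candidate), is_too_repetitive(candidate))
```

Only `aaaaabcd` (a run of 5 in 8 characters) is flagged. The alternating patterns never have a run longer than 2, so they pass regardless of the threshold.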
What might be more useful is looking at the total diversity of characters while taking the length into account. For example, in the first example, there are only two unique characters in a password of length 12, again only two in the second example, and three in the third. This feels like it catches more types of issues. I’m sure it also has its problems.
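As a small illustration of the diversity idea, the sketch below counts unique characters and relates them to the length. `diversity` is a hypothetical helper, and no particular cutoff is implied.

```python
def diversity(password):
    """Return (number of unique characters, unique characters / length)."""
    unique = len(set(password))
    ratio = unique / len(password) if password else 0.0
    return unique, ratio


for candidate in ("azazazazazaz", "aazaazaazaaz", "abcabcabcabc"):
    unique, ratio = diversity(candidate)
    print(f"{candidate}: {unique} unique characters, ratio {ratio:.2f}")
```

The unique counts come out as 2, 2, and 3 for the three examples, matching the figures given above.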
One in particular is how to convey to the user what’s acceptable and what isn’t. A rule like “at least x different characters” probably works well enough if we assume most people follow NIST’s recommendation of a minimum of 8 characters and we don’t bother taking the password length into account.
Rather than reinventing the wheel, I would recommend that if any work is done here, it should be to use an existing password strength estimation library like zxcvbn (see also zxcvbn-python) or libpwquality.
I don’t think we should add repeated pattern detection to Django’s built-in password validators. It’s not entirely clear that it would improve security. And simplistic implementations could interfere with using a password manager’s random generator (which does improve security).
Maybe a validator like “must include at least n different characters” (len(set(password)) >= n) would be workable. But even then I’m not certain it’s helpful for enforcing password strength. (Are you sure that a 32 character long password is insufficiently strong if it only contains D, j, and o?)
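To put rough numbers on that question: if, and only if, every character is chosen uniformly at random, a password of a given length over an alphabet of k symbols carries length × log2(k) bits of entropy. The sketch below applies that formula; the comparison alphabet of 94 printable non-space ASCII characters is an assumption made for illustration, and human-chosen passwords will carry less than these figures.

```python
import math


def random_entropy_bits(length, alphabet_size):
    """Entropy in bits of a uniformly random string over the given alphabet."""
    return length * math.log2(alphabet_size)


# 32 characters drawn at random from just D, j, and o.
three_symbols = random_entropy_bits(32, 3)
# 8 characters drawn at random from 94 printable ASCII characters (assumed alphabet).
eight_printable = random_entropy_bits(8, 94)

print(f"32 chars over 3 symbols:  {three_symbols:.1f} bits")
print(f"8 chars over 94 symbols:  {eight_printable:.1f} bits")
```

Under the random-choice assumption, the two come out within a couple of bits of each other (about 51 versus about 52), which is why a distinct-character count alone says little about strength.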
I’m not a security expert. It’s a tricky topic, and my intuition is often incorrect. So I try to seek out expert guidance.
Recommendations on password requirements have changed over time. Some common rules—ones that seemed like perfectly good ideas—have been shown to actually decrease overall security.[1]
The latest guidance on password strength rules seems to be:
Require a minimum length: at least eight chars (and at least 15 is better).
Check against a blocklist of commonly used/breached/likely passwords.[2]
And nothing else. In particular, “complexity” and “composition” requirements are no longer recommended.
(Also, be sure to rate limit your login form! And don’t block paste from a password manager!)
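For concreteness, the two recommended checks fit in a few lines. This is a sketch only: `password_problems` and `SAMPLE_BLOCKLIST` are hypothetical names, the three-entry blocklist is a stand-in for a real one, and lowercasing before the lookup is an assumption. In a Django project, the built-in `MinimumLengthValidator` and `CommonPasswordValidator` already cover this ground.

```python
# Hypothetical stand-in; a real blocklist would be far larger.
SAMPLE_BLOCKLIST = frozenset({"password", "12345678", "aaaaaaaaaaaa"})


def password_problems(password, min_length=8, blocklist=SAMPLE_BLOCKLIST):
    """Return a list of human-readable problems; an empty list means acceptable."""
    problems = []
    if len(password) < min_length:
        problems.append(f"Password must be at least {min_length} characters long.")
    # Compare the whole password, not substrings. Lowercasing is an assumption.
    if password.lower() in blocklist:
        problems.append("Password is too common.")
    return problems


print(password_problems("a" * 12))
print(password_problems("short"))
print(password_problems("ozozozdaiS3vohQuoohp9kaexo1zeirieC4soZ"))
```

Note that there is no composition or pattern rule here: the generated password with the leading `ozozoz` passes, while `'a' * 12` is rejected only because it is on the list.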
Trying to prevent repeated patterns is an example of a complexity and composition requirement. So is requiring a minimum number of distinct characters.
Using complexity requirements (that is, where staff can only use passwords that are suitably complex) is a poor defence against guessing attacks. … Additionally, complexity requirements provide no defence against common attack types such as social engineering or insecure storage of passwords.
For the above reasons, the NCSC do not recommend the use of complexity requirements when implementing user generated passwords. The use of technical controls to defend against automated guessing attacks is far more effective than relying on users to generate (and remember) complex passwords. However, you should specify a minimum password length, to prevent very short passwords from being used.
(NCSC also has some caveats about password strength calculators, just above the section I linked.)
The US NIST's SP 800-63B draft fourth revision, section 3.1.1.2:
1. Verifiers and CSPs [credential service providers] SHALL require passwords to be a minimum of eight characters in length and SHOULD require passwords to be a minimum of 15 characters in length.
…
5. Verifiers and CSPs SHALL NOT impose other composition rules (e.g., requiring mixtures of different character types) for passwords.
…
When processing a request to establish or change a password, verifiers SHALL compare the prospective secret against a blocklist that contains known commonly used, expected, or compromised passwords. The entire password SHALL be subject to comparison, not substrings or words that might be contained therein.
Also check out NIST’s Appendix A for less formal background info and the rationale behind their guidelines.
[1] E.g., requiring password rotation on a forced schedule. Or forbidding punctuation characters in passwords, because that might lead to SQL injection. See NIST SP 800-63B, draft fourth revision, Appendix A for helpful discussion and citations.
[2] FWIW, the "a" * 12 example from the OP appears in haveibeenpwned’s list of previously breached passwords, though not in Django’s built-in common-passwords.txt (but "a" * 10 is in there). There are a number of third-party Django libraries that can check the pwned-passwords database.
Compared to the current version, the draft fourth revision clarifies blocklist application to prohibit checking substrings of the password against the blocklist, and removes earlier guidance that the blocklist “MAY include … [r]epetitive or sequential characters (e.g. ‘aaaaaa’, ‘1234abcd’).” It also drops the earlier recommendation to include a password strength meter.
I’m not in favor of adding any constraints. However, I could see the community agreeing on a particular way of implementing stronger password validation and putting it in the docs as an example of how to implement a third-party package that does it.
Thank you for the detailed replies. I agree that measuring the security of passwords is not straightforward and that a bad implementation will do more harm than good.
When processing a request to establish or change a password, verifiers SHALL compare the prospective secret against a blocklist that contains known commonly used, expected, or compromised passwords.
I saw that recommendation. However, it is not very helpful: it doesn’t say how big the blocklist should be. Should the 20k-entry blocklist that Django uses be considered sufficient?