Entropy isn't sufficient to measure password strength

Entropy isn't sufficient to measure password strength (benwr.net)

34 points by benwr 4 years ago | 120 comments

bigiain 4 years ago |

> Because choosing good passwords is about memorableness as well as sheer strength

That's not been true ever since the development of good password managers. There are fewer than 10 passwords I remember. One of them is my password manager's master passphrase (5 misspelled-and-with-random-punctuation words). The others include stuff like my work and home laptop/disk passwords, which I can't autofill, my 3 important banking passwords which I do not even entrust to my password manager, and my AppleID password because iOS is annoying enough at asking for that that I'm using one I can remember.

The other ~600 entries in my password manager are 25 random characters (or whatever the upper limit of password length is for sites/services that are 'doin it wrong').
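A minimal sketch of generating that kind of 25-character random password, assuming Python's standard `secrets` module. The 94-symbol alphabet and the `max_len` parameter (for sites with a length cap) are illustrative assumptions, not something the comment specifies:

```python
import math
import secrets
import string
from typing import Optional

# Assumed alphabet: ASCII letters, digits, and punctuation (94 symbols).
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def random_password(length: int = 25, max_len: Optional[int] = None) -> str:
    """Draw each character independently and uniformly from ALPHABET."""
    if max_len is not None:
        length = min(length, max_len)  # respect a site's (hypothetical) length cap
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def uniform_bits(length: int, alphabet_size: int = len(ALPHABET)) -> float:
    """Entropy in bits of a uniformly random string of this length."""
    return length * math.log2(alphabet_size)

if __name__ == "__main__":
    pw = random_password()
    print(pw, f"(about {uniform_bits(len(pw)):.0f} bits)")
```

Because every character is drawn uniformly and independently, the per-password entropy figure is meaningful here; that is not true of passwords a human makes up.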

Biganon 4 years ago | |

One could argue that you still need to remember your master password, and since it gives access to all your other passwords, it's all the more important to make it extremely strong. Therefore the randomness/memorability trade-off is still very important.

shepherdjerred 4 years ago | | |

Yes, but it’s not too hard to make one ridiculously long/complicated master password that is also memorable. It might take you a while to remember it — just keep it written down on paper somewhere private & safe and refer to it as needed. If you’re not being targeted then you’ll probably be fine.

omegalulw 4 years ago | |

> That's not been true ever since the development of good password managers.

A lot of people do not trust password managers; case in point, the recent LastPass scare.

You want passwords to your key accounts to be 1) memorable 2) strong 3) only in your head. For these, I think the article is fairly relevant.

awelxtr 4 years ago | | |

> A lot of people do not trust password managers; case in point, the recent LastPass scare.

That's no excuse. KeePass lets you keep the database file locally, where it's your duty to manage it.

It might be less convenient, maybe. But I don't see any valid excuse for people not to start using a password manager, least of all the less tech-savvy.

smaudet 4 years ago | | |

The problem with password managers was that they were a commercial venture. Not that commercial is inherently worse in the general case, but:

1. Closed source, so you cannot audit a critical piece of security infrastructure.

2. Perverse incentives - they want to make money, so they are naturally going to encourage new versions over old and deprecate support for old programs.

2a. If your company of choice isn't doing great business, it has an active incentive to sell your data (including bank passwords) on the black market.

3. A need to keep "up to date", i.e. to jam whatever is hot into the app to up the selling appeal. You want your security to be very boring; having a bunch of new features mixed into every release is a recipe for insecurity and disaster.

4. Cloud access - this leads on from the last point, but as soon as you store your stuff on a third party server, even encrypted, your potential leaks go from your computer to every device between you and the remote, and then some (all third party integrations). As a side effect, said companies must start (complex) security auditing practices, with all the fun and failure points that brings...

Now, even on the open source side:

1. As soon as you have to update your password manager, you might as well throw away all passwords and start over:

a) Can you really trust that no source was breached during the update?

b) How do you know it is even a legitimate update? Better not have put your password for updating things in your password manager...

c) It's open source, great, so you can audit it but... will you?

d) Or will you just trust it, and because some guy who wasn't getting paid and is trying to get through school and hold a part time job missed a critical bug, you end up with all your passwords compromised anyway?

2. Deserves status as its own point: Open Source is auditable but not necessarily trustworthy, not without a lot of active oversight.

As such, one can conclude that such programs are mostly colossal wastes of time, if not actively endangering security.

Even as a 'better than nothing' option they are a bad idea: for laypeople who don't know any better, it's just another potentially bad practice being drilled into them.

I would argue that writing down passwords on paper is usually a better practice than using a password manager, at least that can be locked up in your home (and if you can get into my home I have other bigger worries).

Instead we should focus on giving back some responsibility to the user - most sites don't need passwords, if you are using a password manager for those sites you should presume that password is low security.

It would be better if we could codify the importance of a password somehow.

benwr 4 years ago | |

I mostly agree, but I do find myself choosing new FDE and login passphrases about once a year, and I wish that I could choose these using something like Diceware, but memorable enough that I wouldn't need to write them down at all. Thinking about how I might do that is what ultimately led to this post.

taeric 4 years ago | |

I'm curious if you have a rotation/audit practice for those? With 600 odd passwords, I'm not even sure how I would keep track of access to the items being protected.

bigiain 4 years ago | | |

Rotating/expiring random 25 char passwords is unnecessary.

One big advantage of a password manager is you _can_ audit accounts/passwords. I do a once a year sweep of my personal ones in KeePass, and use it as an opportunity to close accounts on services I no longer use. (Not that I believe any 3rd party service can be trusted to actually delete your data when you close your account, but spending 5 minutes updating your profile with junk data before deleting it improves your chances of not ending up on spam lists or automated credential stuffing attacks when that service gets popped.)

For shared work passwords we use 1Password. While I prefer their old standalone app over their new cloud thing, it does two very useful things: 1) it integrates with HIBP's password checking service, so it warns you when you have a password that's been published in a dump, and 2) it provides an audit trail of which credentials each team member has ever accessed, so you can revoke only what's needed instead of rolling all shared passwords every time a staff member leaves.
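For context, a rough sketch of how a client can check a password against HIBP's public Pwned Passwords range endpoint without sending the password itself. How 1Password wires this up internally is not stated in the thread; the helper names below are hypothetical, and the network call is left as a comment so the block runs offline:

```python
import hashlib
from typing import Tuple

def split_sha1(password: str) -> Tuple[str, str]:
    """Return (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]

def count_in_range_response(suffix: str, response_text: str) -> int:
    """Look up our suffix in a range response of 'SUFFIX:COUNT' lines."""
    for line in response_text.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0  # suffix absent: password not found in the published dumps

if __name__ == "__main__":
    prefix, suffix = split_sha1("password")
    # Only the prefix is sent, e.g. GET https://api.pwnedpasswords.com/range/<prefix>
    # The sample response below is made up for illustration.
    sample = "0018A45C4D1DEF81644B54AB7F969B88D65:1\r\n" + suffix + ":3"
    print(prefix, count_in_range_response(suffix, sample))
```

The service only ever sees the first five hex characters of the hash, which match many unrelated passwords; the comparison against the full suffix happens locally.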

nextaccountic 4 years ago | |

> 5 misspelled-and-with-random-punctuation words

Why misspell and add random punctuation?

Biganon 4 years ago | | |

So that they can't be found in a dictionary

Granted, 5 words chosen truly randomly from an English dictionary is already insanely strong, but why not make it slightly stronger?

c0balt 4 years ago | | |

Most likely to make dictionary attacks against the password(s) ineffective.

bell-cot 4 years ago |

Maybe I just don't have trendy-enough coworkers or friends...but I know of no one who actually analyzes password strength in terms of Shannon entropy. Cripes, the very first sentence of the Wikipedia page for Shannon entropy tells us that it's an average.

Simple analogy - if the goal was to protect your house from a 9-foot-deep flood, would a dike with an average height of 10 feet do the job?

benwr 4 years ago | |

I've done a fair bit of research into this, and as far as I can tell, the entire internet does this thing you've never seen. For example, https://en.wikipedia.org/wiki/Password_strength#Entropy_as_a... implies the use of Shannon entropy.

bell-cot 4 years ago | | |

[sigh...] +1, though you're making me feel d*mn old.

I won't tell you what decade it was, when I found that some "bright" user had picked his/her own office phone # (10 digits, 2 hyphens) to use as a "high security" password.

My own mental model - with a decent compression algorithm, and compression dictionary pre-loaded with popular passwords and personal information, how many bits would the specific password in question compress to? That also catches the clever folks who pick stuff like "abcdabcdabcdabcd" or "3.1415926535".
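A toy version of that mental model, assuming zlib's DEFLATE as the "decent compression algorithm" and a tiny made-up preset dictionary standing in for the list of popular passwords and personal information. The dictionary contents (including the phone number) and the function name are hypothetical, and DEFLATE is a crude stand-in, so treat the numbers as relative rather than as real strength estimates:

```python
import zlib

# Hypothetical stand-in for a dictionary pre-loaded with popular passwords
# and personal information; a real one would be far larger.
PRESET = b"password123 letmein qwerty iloveyou 555-867-5309 3.1415926535"

def compressed_bits(candidate: str, preset: bytes = PRESET) -> int:
    """Size in bits of the raw DEFLATE stream for candidate, given the preset."""
    if preset:
        comp = zlib.compressobj(level=9, wbits=-15, zdict=preset)
    else:
        comp = zlib.compressobj(level=9, wbits=-15)  # no dictionary at all
    data = comp.compress(candidate.encode("utf-8")) + comp.flush()
    return 8 * len(data)

if __name__ == "__main__":
    for pw in ["abcdabcdabcdabcd", "3.1415926535", "password123", "q7#Zp!x2Lm@9Vk$T"]:
        print(f"{pw!r}: {compressed_bits(pw)} bits")
```

Repetitive strings and strings already present in the dictionary shrink to a few bytes, while a random-looking string of the same length does not compress at all.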

dcl 4 years ago | |

Yep, one of those cases where an ensemble average is not at all relevant for describing the situation.

krupan 4 years ago |

When will we stop using passwords?! They are an elementary school kid “secret club” game taken way, way too far. They are totally broken. Nobody can come up with and remember good passwords. Nobody can store passwords securely. 100% busted.

Instead of continuing to debate what makes a good password, we need to put our energy into better techniques altogether! No more shared secrets! Let’s talk about one-time codes, asymmetric key cryptography, hardware tokens, anything but passwords!!

iflp 4 years ago |

Kolmogorov complexity/entropy is more suitable for this purpose, under the implicit assumption that password crackers don't have tailored prior knowledge and are just enumerating "simple" sequences. It only agrees with Shannon entropy on long ergodic sequences. The author basically constructed an example where the two notions don't agree.

canjobear 4 years ago | |

How would you estimate the Kolmogorov complexity for the author's example?

iflp 4 years ago | | |

Kolmogorov complexity is only unambiguously defined asymptotically, and "asymptotics is merely a heuristic". It is also uncomputable. So, to use entropy arguments for passwords, the only correct way I could think of is to generate long and (elementwise) random passwords.

lapinot 4 years ago | | |

By giving the password to a good compressor (and then computing the Shannon entropy of the result)? Yet I'm not sure I know a good compressor for short strings... Perhaps something like gpt2tc tailored to passwords instead of English text.

aeternum 4 years ago | |

The implicit assumption however isn't good. Password crackers regularly make use of prior knowledge. A password that consists of a Shakespearian Sonnet for example has very high complexity but makes for a bad password.

lapinot 4 years ago | | |

Kolmogorov complexity kinda does account for "prior knowledge" (that's why it's not computable). A Shakespearean sonnet will have low Kolmogorov complexity (there's redundancy).

BeefWellington 4 years ago |

There's a bit of a logical flaw here in that the argument is made against average entropy of a set of passwords, rather than individual entropy of each chosen password.

This is an argument I can't find anyone making: an aggregate average entropy of the set of all passwords you use is fine for password security, rather than the entropy of each individual password.

As far as I can tell this seems to be a (possibly intentional?) misunderstanding on the author's part.

adgjlsfhk1 4 years ago |

The real question here is whether there are any password strategies actually in use where this distinction matters. In practice, no one would ever use the type of password strategy described.

benwr 4 years ago | |

This is a fair question; I've been thinking about "weird" password choice strategies recently, for which it can matter. For example, if you want your password to be an English sentence, choosing sentences based on random parse trees will produce duplicated sentences with ambiguous parses.

teeray 4 years ago |

It’s important to remember that attackers get no information on how close they are (assuming good hashing practices). It is unknowable to them if you went with the correcthorsebatterystaple approach or placed your cat on the keyboard for a few minutes. Given that, a simpler alphabet with longer strings > more complexity with shorter strings.
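The length-versus-alphabet trade-off can be made concrete with a little arithmetic. This sketch assumes every character is chosen uniformly at random, and the alphabet sizes (94 printable ASCII symbols, 26 lowercase letters) are illustrative; the function name is made up:

```python
def equivalent_length(length: int, alphabet: int, simpler_alphabet: int) -> int:
    """Smallest length over simpler_alphabet whose keyspace is at least as
    large as `length` characters over `alphabet` (exact integer arithmetic)."""
    if alphabet < 2 or simpler_alphabet < 2:
        raise ValueError("alphabets need at least two symbols")
    target = alphabet ** length          # size of the original keyspace
    m = 0
    while simpler_alphabet ** m < target:
        m += 1
    return m

if __name__ == "__main__":
    # How many random lowercase letters match 12 random printable-ASCII characters?
    print(equivalent_length(12, 94, 26))
```

Under these assumptions, a handful of extra lowercase letters makes up for dropping digits, capitals, and punctuation entirely.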

canjobear 4 years ago |

Cool example. An attacker will take 2^234 guesses on average to guess the password, but that's an average of 19 1's and one enormous number. So the attacker will usually guess the answer quickly. It's kind of like the St. Petersburg paradox in that the expectation value doesn't reflect typical behavior.

Seems like this might be a use case for "dispersion" (the second moment of entropy) [1].

[1] https://math.stackexchange.com/questions/1626522/higher-mome...
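To see numerically how the average can mislead, here is a sketch of a deliberately skewed strategy. It is not the article's exact construction: it assumes that 19 times out of 20 you pick one fixed password, and otherwise a uniformly random 256-bit one (both parameters are hypothetical):

```python
import math

def skewed_strategy_stats(p_weak: float = 0.95, strong_bits: int = 256):
    """Return (Shannon entropy, min-entropy, expected guesses) for a mixture:
    with probability p_weak use one fixed password, otherwise draw uniformly
    from 2**strong_bits random passwords."""
    n_strong = 2 ** strong_bits
    p_each = (1.0 - p_weak) / n_strong   # probability of any one strong password
    shannon = -p_weak * math.log2(p_weak) - (1.0 - p_weak) * math.log2(p_each)
    min_entropy = -math.log2(max(p_weak, p_each))
    # An attacker who knows the strategy tries the fixed password first, then
    # walks through the strong passwords in some arbitrary order.
    expected_guesses = p_weak * 1 + (1.0 - p_weak) * (1 + (n_strong + 1) / 2)
    return shannon, min_entropy, expected_guesses

if __name__ == "__main__":
    h, h_min, guesses = skewed_strategy_stats()
    print(f"Shannon entropy : {h:.2f} bits")
    print(f"min-entropy     : {h_min:.3f} bits")
    print(f"expected guesses: about 2^{math.log2(guesses):.0f}")
```

The Shannon entropy and the expected number of guesses both look respectable, yet the attacker succeeds on the first guess 95% of the time. Min-entropy, which depends only on the most likely password, is the figure that exposes the problem.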

MattPalmer1086 4 years ago |

The argument feels like a straw man.

He seems to be saying, if your password selection strategy skews towards really weak passwords, and you measure the Shannon entropy of the distribution, it won't reveal that this is a bad strategy.

I don't know anyone who would actually do this and declare a win "because Shannon".

At best, it's mildly interesting that Shannon entropy on its own isn't going to give you a useful answer if you have a weak strategy.

croes 4 years ago |

I thought it's about the entropy of the chosen password, not the entropy of the set of passwords you could have chosen.

benwr 4 years ago | |

Entropy of a single password isn't actually a well-defined concept; entropy is always about a distribution. "Entropy calculators" that look at your password and tell you "its entropy" are making assumptions about how you chose the password.

We care about the distribution from which you drew the password, because that lets us analyze how difficult it would be for an attacker who knew your password selection process to brute-force the password. Just knowing the password itself isn't enough information to determine that (though of course you can judge how hard it would be for an attacker once you know their brute forcing strategy).
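A small illustration of why "the entropy of a password" depends on the assumed selection process: the same string scored under two different models. The models, the 26-letter alphabet, and the 2048-word list size (the figure used in xkcd 936) are assumptions for the sake of the example:

```python
import math

def bits_if_random_chars(password: str, alphabet_size: int = 26) -> float:
    """Model A: every character was drawn uniformly from the alphabet."""
    return len(password) * math.log2(alphabet_size)

def bits_if_random_words(n_words: int, wordlist_size: int = 2048) -> float:
    """Model B: n_words were drawn uniformly from a known word list."""
    return n_words * math.log2(wordlist_size)

if __name__ == "__main__":
    pw = "correcthorsebatterystaple"
    print(f"as random letters: {bits_if_random_chars(pw):.1f} bits")
    print(f"as 4 random words: {bits_if_random_words(4):.1f} bits")
```

Neither figure is a property of the string; each is a property of a hypothesized distribution. An attacker who knows the real process needs only as much work as the lower figure implies.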

iechoz6H 4 years ago |

I typically use a phrase from my life e.g.

MathsDegree@StamfordWasABigWin [1] RanThroughAPlateGlassDoorWhenTen [2]

with some esoteric obfuscation rules.

1. I don't have a maths degree from Stamford. 2. Did happen, not one of my passwords.

willis936 4 years ago |

Hasn't this problem been solved for decades by diceware?

Use words as your characters with a dictionary of a few thousand words. Assume an attacker knows the dictionary. Make passwords that are too long to brute force (40+ characters). Use enough words that a dictionary attack is also infeasible (4+). Add a salt if you're feeling extra spicy.

Entropy is sufficient if you use the right language model.
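For readers unfamiliar with Diceware, a minimal sketch of the procedure: five dice rolls select one word from a 7776-entry list (6^5 = 7776). The function names, the default of five words, and the use of `secrets` in place of physical dice are assumptions here; you would supply a real word list yourself:

```python
import secrets
from typing import List, Sequence

DICE_PER_WORD = 5
LIST_SIZE = 6 ** DICE_PER_WORD  # 7776 words, one per possible roll sequence

def roll_dice(n: int = DICE_PER_WORD) -> List[int]:
    """Simulate n fair six-sided dice (swap in physical rolls if preferred)."""
    return [secrets.randbelow(6) + 1 for _ in range(n)]

def dice_to_index(rolls: Sequence[int]) -> int:
    """Read the rolls as a base-6 number: 1-1-1-1-1 -> 0, 6-6-6-6-6 -> 7775."""
    index = 0
    for r in rolls:
        if not 1 <= r <= 6:
            raise ValueError("each roll must be between 1 and 6")
        index = index * 6 + (r - 1)
    return index

def passphrase(wordlist: Sequence[str], n_words: int = 5) -> str:
    """Pick n_words independently and uniformly from a 7776-word list."""
    if len(wordlist) != LIST_SIZE:
        raise ValueError(f"expected a list of {LIST_SIZE} words")
    return " ".join(wordlist[dice_to_index(roll_dice())] for _ in range(n_words))

if __name__ == "__main__":
    demo_list = [f"word{i}" for i in range(LIST_SIZE)]  # placeholder words
    print(passphrase(demo_list))
```

Each word contributes log2(7776), roughly 12.9 bits, even if the attacker knows the list, which is why the entropy calculation is trustworthy for this scheme.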

pmw 4 years ago |

This is a good place to advertise https://phrase.shop - a webapp I wrote that makes secure yet memorable passphrases.

It makes entropy requirements explicit, and you can even roll your own dice to supply the required entropy to generate your passphrase.

Try it, it's fun!

DarylZero 4 years ago |

It seems like it is still sufficient for passwords that are generated in a normal way.

Bolkan 4 years ago |

https://xkcd.com/936/

voiper1 4 years ago | |

Instructions unclear, password on all sites is now "correct horse battery staple".

Inspired by this, there's a package https://github.com/dropbox/zxcvbn to estimate entropy and give suggestions.

BoiledCabbage 4 years ago | |

Fundamentally, is there any flaw with this method? Or a reason why it isn't better than the general password approach?

throwawayffffas 4 years ago | | |

You can only remember a limited number of passwords regardless of whether it's a sequence of words or a sequence of random characters. The main flaw in all these schemes is that you have to remember them. The only viable option is to use a password manager.

kbart 4 years ago | | |

It's vulnerable to the dictionary-based attacks that are very common.