Don't conceive your own cryptographic hacks. Use existing KDF designed by professionals.
I've in the past been annoying about saying I think we should just call all password hashes "KDFs", but here's a really good illustration of why I was definitely wrong about that. A KDF is a generally-useful bit of cryptography joinery; a password hash has exactly one job.
Including the username in the hash input gives you guaranteed domain separation between users that you don't get from salts/nonces. Its a generally good idea if you have a hash function with unlimited input size (all modern cryptographic hash functions except bcrypt have unlimited input size).
I'm kind of baffled how they came to use bcrypt for this. Bcrypt is not exactly subtle about only supporting 72 bytes of input. And this is at a company who provides auth as a service; I've got to imagine they had multiple engineers who knew this (I guess not working on that code). Hell, I know this and I've only used bcrypt twice and I'm nowhere near a security/crypto guy.
The naming overlap between the two is bad, so the industry has tried to move towards naming the two differently. Password hashing functions are not ideal KDFs, even though a particular primitive may be secure for use as a KDF. That's a root of some of the confusion.
Also I'm not sure the average developer understands the distinction.
> Also I'm not sure the average developer understands the distinction.
I'm an average developer. I'm not sure that I understand exactly. What should I be reading, or what can you tell me?Thank you!
I remember :-): https://news.ycombinator.com/item?id=42899432
Product - we need to provide service availability even if the AD is down
Engineer - Ok, may be we can store the ~credentials in cache
Security - oh, in that case make sure everything in cache is hashed properly with the recommended Bcrypt algorithm
Engineer - We got the approval from the security, we are in much safer zone, lets deliver and get a win
In addition to (true) KDFs, people often want a HKDF (HMAC-based Key Derivation Function) or hierarchical deterministic key derivation function (HDK).
I’m wondering if okta was inspired by those.
I feel gross calling a function that just blatantly ignores part of its input a hash, much less a password hash. It's like calling a rock a fish, because they're both in water, despite the lack of swimming. In any case, a hash that ignores some of its input is certainly not a cryptographically secure hash, so why is it being used for crypto?
The analogy is something like creating a hash map whose insert function computes the slot for the key and unconditionally puts the value there instead of checking if the keys are the same during a collision. No amount of tinkering with the hash function fixes this problem. The algorithm is wrong. A hashmap should survive and be correct even giving it a hash function that always returns 4.
I co-maintain a rate-limiting library that had some similar rough edges, where it wouldn't always be obvious that you were doing it wrong. (For example: limiting the IP of your reverse proxy rather than the end user, or the inverse: blindly accepting any X-Forwarded-For header, including those potentially set by a malicious user.) A couple years back, I spent some time adding in runtime checks that detect those kinds of issues and log a warning. Since then, we've had a significant reduction in the amount of not-a-bug reports and, I assume, significantly fewer users with incorrect configurations.
72 is the max length of id, username, and password combined. If that combination is over 72, then failure and the cache key would not have been created. So, no, the attacker would not need to guess only one character of a password.
The API was designed to generate a hash for a password (knowledge factor) and for performance and practical reasons a limit has been picked up (72). The chances that some one knows your first 72 characters of password implies that the probably is a lot higher for the abuser to have remaining characters too.
While smaller mistake here in my opinion was not knowing the full implementation details of a library, the bigger mistake was trying to use the library to generate hash of publicly available/visible information
There's nothing wrong with a limit. The problem is that the library silently does the wrong thing when the limit is breached, rather than failing loudly.
As a mistake, it's fine. Everyone writes up things like that. But defending it as an affirmatively good decision is wild.
``` (defmethod derive-key ((kdf bcrypt) passphrase salt iteration-count key-length) (declare (type (simple-array (unsigned-byte 8) (*)) passphrase salt)) (unless (<= (length passphrase) 72) (error 'ironclad-error :format-control "PASSPHRASE must be at most 72 bytes long."))...) ```
Also I'm not sure what functionality the authentication cache provides, but their use of bcrypt(userId + username + password) implies the password is kept around somewhere, which is not the best practice.
OT. Has Argon2 basically overtaken Bcrypt in password hashing in recent years?
That depends on how exactly it was used. If it was simply used to check if previous authentication was successful (without the value containing information who it was successful for) then single long password could be used to authenticate as anyone.
Only if everyone uses the same long prefix for password.
It's great to hear that Zig covered both cases. However, I'd still prefer the opposite behavior: a safe (without truncation) default `bcrypt()` and the unsafe function with the explicit name `bcryptWithTruncation()`.
My opinion is based on the assumption that the majority of the users will go with the `bcrypt()` option. Having AI "helpers" might make this statistic even worse.
Do you happen to know Zig team's reasoning behind this design choice? I'm really curious.
These choices are documented in the function's docstring, but not obvious, nor do they seem encoded in a custom version.
`bcryptWithTruncation()` is great for applications entirely written in Zig, but can create hashes that would not verify with other implementations.
The documentation of these functions is very explicit about the difference.
The verification function includes a `silently_truncate_password` option that is also pretty explicit.
Also neither seems to warn about the NUL issue.
[0] I assume for compatibility purpose, but it still seems very dubious.
If you don't see the mistake, Google 'yaml.safe_load'.
Is there a reason I might be missing?
Hash functions themselves are general purpose and don't protect against low entropy inputs (low entropy passwords). They also don't protect against rainbow tables (pre-calculated digests for common or popular passwords). For password hashing you want something slow and something with unique entropy for each user's password to prevent rainbow attacks.
It doesn't solve the problem of weak passwords, but it's the best that can be done with weak passwords. The only improvement is to enforce strong passwords.
> SHA-2 family of hashes was designed to be fast. BCrypt was designed to be slow.
Slow == harder to brute-force == more secure.
The node crypto module is essentially an API that offloads crypto work to OpenSSL. If we dig into OpenSSL, they won't support bcrypt. Bcrypt won't be supported by OpenSSL because of reasons to do with standardisation. https://github.com/openssl/openssl/issues/5323
Since bcrypt is not a "standardised" algorithm, it makes me wonder why Okta used it, at all?
I remember in uni studying cryptography for application development and even then, back in 2013, it was used and recommended, but not standardised. it says a lot that 12 years on it still hasn't been.
How does a company whose only job is security screw that up so badly?
One of the points of the article is that documentation isn’t enough: one cannot allow callers to misuse one’s API.
While I don't have any answers to this, I've realized that it's an ideal showcase of why fuzzy testing is useful.
On the other hand, I'm not a security developer at Okta.
Silently truncating the data is about the worst way to deal with it from a security standpoint. No idea why that decision was made back in the day.
https://gist.github.com/neontuna/dffd0452d09a0861106c0a46669...
bcrypt::non_truncating_hash()
https://docs.rs/bcrypt/latest/bcrypt/Funnily, TFA later also suggests that such function should exist...
amusingly, the python "library" is just a thin wrapper around the same rust library.
protip: a lot of cryptography primitives actually aren't that complicated in terms of the code itself (and often can be quite elegant, compact and pleasing to the eye). if it's important, probably worth just reading it.
it's what people wrap them with or the systems they build that get messy!
About solutions, Django hashes by default the password only with a salt. I'm not sure why it would be valuable to combine user_id+username+password. I've always assumed that using salt+password was the best practice.
cache_key = sha(sha(id + username) + bcrypt(pass))
with sha256 or something.Is there any security issues with that? I'm a "newb" in this area, so I'm genuinely curious about the flaws with the naive approach
Weird take. Usernames are often chosen by the user. Less so in corporate world but definitely not unheard of
https://www.boehringer-ingelheim.com/media-stories/press-rel... has a camilla.krogh_lauritzen@boehringer-ingelheim.com at 48 characters, for example.
- MUST be non-reversible, including against tricks like "rainbow tables"
- should be somewhat expensive to discourage just trying all possible passwords against a (leaked) hash
KDF is a key derivation function. The value will be used as a key in, say, AES. The important properties are:
- should distribute entropy as well as possible, across the required width of output bits
- reversibility less important as the derived key shouldn't be stored anywhere
- may or may not want artificially inflated cost to discourage cracking
There's either (1) nobody competent enough there to know (which is likely not true, I had a pentester friend recently join, and she is very good), or, more likely (2) management doesn't care and/or doesn't give enough authority to IT security personnel.
As long as clients don't have any better options, Okta will stay this way.
Yes, this is very true.
Also some companies realize that they can screw up royally because they do no have the proper knowledge, and authentication is not a core business of theirs.
I can understand them. I also use mail systems I am not that happy with, but I have this comforting idea that if they have a problem, 3B people are waiting together with me for it to be solved, and that's the kind of pressure that helps.
(It's still got the truncates-at-72 problem with PASSWORD_BCRYPT, though.)
It's valid to assume "it will never happen" for 128 bits or more (if the hash function isn't broken) since chance of a random collision is astronomically small, but a collision in 64 bits is within realm of possibility (50% chance of hitting a dupe among 2^32 items).
Think about it, that's what hashing passwords relies on. We don't store a plaintext password for a final check if the password hash matches, we count on a collision being basically impossible.
A hashmap is different, because it's using a much weaker hash function with far fewer security guarantees.
Plus, you're assuming the original values are even kept around for comparison. The cache key likely just mapped to something simple like a boolean or status flag.
Now, the data structure they're using for a cache will use some sort of hash table, likely in memory, so maybe they've got the 192-bit bcrypt "key" and then that's hashed again, perhaps well or perhaps badly [e.g. C++ really likes using the identity function so hash(12345) = 12345] but then a modulo function is applied to find the key in an index and then we go groping about to look for the Key + Value pair. That part the API probably took care of, so even if the hash has size 6-bits, the full 192-bit key was checked. But the original data (userid:username:password) is not compared, only that 192-bit cache key.
Poor API design can make it easier for other contributing factors (checking cache here, but could also be not running load tests, not fuzzing, human error, etc.) to cause incidents.
I'm glad to see this come out, plus which libraries handle out of bounds conditions with errors vs. fix-up the input to cause silent failures.
Similarly, a developer I worked with once claimed that CRC32 was sufficient verification because CRC32s changed so drastically depending on the data that they were difficult to forge. He was surprised to find out not only is it trivial to update a CRC32, but also to determine the CRC polynomial itself from very few samples.
I've tried to stay positive and explain that they will fool nearly everyone, the technology they used is usually recognizable to the type of people that would want to bypass it for their own gain (or knowledge, assuming white hat types poking around). Usually putting a spin on how they came up with something that looks secure was a great idea, but the type of people that will exploit something like what they built, will recognize patterns easily (and now that AI is around, you could even make them feel better by stating how there is software built to recognize these patterns).
[Im assuming the usual definition of salt where it is known by the attacker... a pepper would be fine]
So something like crypto.danger.bcrypt and crypto.bcryptWithTruncation
bcrypt(longpassword + 123456 + me@foobar.com) = bcrypt(longpassword) = hash1 -> true
If they then try login as you@bar.com using same password there would be a cache lookup:
bcrypt(longpassword + 1111111 + you@bar.com) = bcrypt(longpassword) = hash1 -> true
Like getting an input that is too long? :)
I think a library asserting that the preconditions of its arguments are true is fine.
An error code is risky.
Apps crashing with assets is awfully, but at least it screams at your when you failed to read the docs, target than incorrectly storing users data for the rest of time.
According to the security advisory this cache was for AD/LDAP delegated authentication, so they don't have their own password database with a version field or similar for sensible invalidation.
I guess the requirements could be something like:
- different username/password combinations must have separately cached results
- mitigate a potential data leak by putting all the entropy we have available together with the password material and using a slow password hashing functionI would agree that it should not just be called “bcrypt” though, likely no function of this module should be, they should either explain their risks or clarify their safety.
Or possibly only a version which fails if passed more than 72 bytes or any nul.
I don't like using thought-stopping cliches any more than anybody else does, but this design feels a little cargo-culted. All this stuff follows the more fundamental question of "why is the password mixed into a cache key"?
* bcrypt(SHA-512(PW || stuff))
* SHA(stuff || bcrypt(PW))
Disclaimer: Not cryptography advice.
It's still unclear to me why the password is in there.
bcrypt stores the salt and retrieves it for comparison - otherwise you wouldn't be able to generate a matching hash.
Consider the case where a user has a very long username and sets their password to their userId + username + password thus recreating the scenario which lead to the incident.
If you use only the password to generate the cache key, then this password will match regardless of salt, so users with the same password will generate a cache key matching that password.
The birthday paradox is a thing. If you have 128 bits of entropy, you expect the 50% mark to be proportional to 64-bit keys, not 128 bits. 64 bits is a lot, but in my current $WORK project if I only had 128 bits of entropy the chance of failure any given year would be 0.16%. That's not a lot, but it's not a negligible amount either.
Bigger companies care more. Google has a paper floating around about how "64 bits isn't as big as it used to be" or something to that effect, complaining about how they're running out of 64-bit keys and can't blindly use 128-bit random keys to prevent duplication.
> bits of entropy
Consumer-grade hash functions are often the wrong place to look for best-case collision chances. Take, e.g., the default Python hash function which hashes each integer to itself (mod 2^64). The collision chance for truly random data is sufficiently low, but every big dictionary I've seen in a real-world Python project has had a few collisions. Other languages usually make similar tradeoffs (almost nobody uses crytographic hashes by default since they're too slow). I wouldn't, by default, trust a generic 1-million-bit hash to not have collisions in a program of any size and complexity. 128 bits, even with low enough execution counts to otherwise make sense, is also unlikely to pan out in the real world.
Birthday paradox applies when you store the items together (it's a chance of collision against any existing item in the set), so overall annual hashing churn isn't affected (more hashes against a smaller set doesn't increase your collision probability as quickly).
If only the password is used to generate the hash then that password, when used to match against a previously stored hash(cache key here), will also match it, thus producing the exact same vulnerability, but worse because it's enough to have the same password as someone else.
Salting does not help here at all.
So when you read "bcrypt(password)", that just means the salt is implicit, not that it isn't salted.
$2a$12$R9h/cIPz0gi.URNNX3kh2OPST9/PgBkqquzi.Ss7KIUgO2t0jWMUW
\__/\/ \____________________/\_____________________________/
Alg Cost Salt Hash
The Salt part is randomly generated. When you call `bcrypt.compare(output, password)` it uses the salt that's contained in `output`. Two calls of `bcrypt(password)` will generate different outputs(so different salts and thus different hashes), but still if you run `bcrypt.compare(output1, password)` and `bcrypt.compare(output2, password)` they will both match as long `password` was used to generate both.In short: you can't use just the password as that's going to match a cache key that was generated by whoever typed in this exact password. The salt is only there to prevent offline attacks.
Rather you use the bcrypt output itself as the lookup key. And if you do that then the salt will indeed do what it's designed to do.
The function you used is a convenience function which generates random salt, but you can specify your own as is done in this[1] illustration of the Okta incident.
What bcrypt.compare does is essentially to extract the salt from the provided previous output, compute new hash using that and the provided password, and check that the old and new hashes matches.
As such it's equivalent to comparing the outputs of two different "runs" where the same salt is used (modulo timing attacks).
So if you need to recompute the lookup key then you need to use the same salt value.
[1]: https://kondukto.io/blog/okta-vulnerability-bcrypt-auth
Perhaps they did not want to apply cache invalidation purely by the passage of time, or want that passage of time to be long, but wanted to treat a credentials update as a cache invalidating event. A safer way to implement that would perhaps be to have a concept of a version of an account, incremented when authentication options or other significant properties change, and including that in the cache key.
I'm not sure why it would matter though: even if a credentials change does invalidate the cache from the PoV of the user looking up information, the information is potentially still in the cache so could still be referred to by someone else who has gained knowledge of the old credentials.
(I need that "man standing up in the town hall meeting" meme for this.)
Just use a real KDF, if that's really what you want. I'm still confused what password-derived material is doing in a Redis key.
>The user previously authenticated creating a cache of the authentication
Maybe, it's a password encrypted secret token.
An authentication company should have known this...
The SHA-3 family has "extendable-output functions," which can ostensibly be used to generate unlimited numbers of bits (albeit with only a given security level). These are new to SHA-3.
None were trained in security principles. None were security experts. And there was no security review.
That's the assumption everyone makes and it's dangerous.
The fact that someone or some entity does something and only special doesn't make them the best at it (or even close). It's just what they do to survive (and earn).
For businesses, cybersecurity is (like everything else, ultimately) about minimizing costs related to digital threats. That is - any threat scenario can be modelled as T*D, where T is "how likely it's going to happen (per year)", and D is "how much it'll cost us when it does (per incident)"; the result is the expected yearly loss, denominated in dollars. The less of it you have (integrated over all scenarios you can think of), the better, but prevention and mitigation also cost money, so what you're actually minimizing is the (expected loss + mitigation costs); i.e. makes no sense to spend more on improving something than it'll save you.
The reason for this exposition dump is: actual security at the technical level is one of many ways of improving TxD, and usually is neither cheap nor the most interesting one. It's also mostly focused on the "T side" (minimizing risk of an incident), which is harder to move than the "D side" - reducing impact.
The service an authentication company is selling is not "cryptographically unbreakable authentication". What they're selling is roughly: "low-T auth sytem cheaper than you could build&operate yourself + if it breaks it's our fault". That is, more than lowering "T side", they're offering to let you shift part of the liability to them, which significantly lowers "D side".
Internally, how they do it is up to them. But there's only so much need for technical security experts - you obviously can't sell a broken system (everyone has to at least pay a lip service to real security, otherwise people get angry, politicians get interested, and costs start to multiply rapidly), but eventually, it's cheaper to focus on your ability to take on liability from your customers and discharge it somewhere else, which involves improving operations, customer service, etc. - all the stuff you need regular, non-security-expert programmers for.
Note the bit about discharging liability. After working in cybersec and GRC for a bit, I realized security is best understood in terms of managing liability (which corresponds to minimizing the D part of TxD from earlier). That's the primary product of most security service companies, as well as security frameworks and associated compliance audits. They do improve the technical side somewhat too, but that's not why those things are bought. They're bought so, when something happens (something eventually always happen), you could point at the SOC.2 audit results and a string of security contracts and say, "we've followed all the best practices, there was nothing more we could do; crime happens, not our fault" - and have the liability flow down the contractual agreements to other companies, which do the same, until, like spring rain flowing down the mountains, into rivers, into sea, it all gets turned into insurance payouts in the end :).
Might sound cynical, but it's probably the right and reasonable thing to be happening 90% of the time. Shit happens, criminals be criminals, opportunity costs are real, etc.