<< The maximum password length is 72. >>
So if the userid is 18 digits, the username is 52 characters, and the delimiters are 1 character each, then the total length of the non-secret prefix is 72, and bcrypt will drop the secret suffix.
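The arithmetic above can be checked with a small sketch. It does not call a real bcrypt library: `first_72_bytes` is a hypothetical helper that only mimics the truncation step, and the ID, username, `:` delimiter and field order are made-up assumptions, not Okta's actual layout.

```python
BCRYPT_MAX_INPUT = 72  # bcrypt ignores input bytes beyond this


def first_72_bytes(data: bytes) -> bytes:
    """Hypothetical stand-in for the truncation bcrypt applies to its input."""
    return data[:BCRYPT_MAX_INPUT]


def build_input(user_id: str, username: str, password: str) -> bytes:
    # Assumed layout: userId, delimiter, username, delimiter, password.
    return f"{user_id}:{username}:{password}".encode("utf-8")


user_id = "1" * 18   # 18-digit ID (made up)
username = "u" * 52  # 52-character username (made up)

# 18 + 1 + 52 + 1 bytes of non-secret prefix.
prefix_len = len(f"{user_id}:{username}:".encode("utf-8"))

good = first_72_bytes(build_input(user_id, username, "correct horse"))
bad = first_72_bytes(build_input(user_id, username, "anything else"))

print(prefix_len)   # 72
print(good == bad)  # True: no password byte survives the truncation
```

With these lengths, whatever would be hashed is identical for every password, which is the failure mode described above.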
You aren’t supposed to put more than the salt and the password into traditional Unix password hashes.
Core issue (Okta's approach):
* They concatenated userId + username + password for a cache key
* Used BCrypt (which has a 72-byte limit)
* The concatenation could exceed 72 bytes, causing the password portion to be truncated
Why this is problematic:
* BCrypt is designed for password hashing, not cache key generation
* Mixing identifiers (userId, username) with secrets (password) in the same hash
* Truncation risk due to BCrypt's limits
Password storage should be separate from cache key generation. Use a random salt plus an appropriate hash function for passwords; for cache keys, use an HMAC or KDF with appropriate inputs.
    key = anyhash(uuid + username)
    if (result := cache.get(key)):
        if hash_and_equality(password, result.password_hash):
            return result.the_other_stuff
    # try login or else fail
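A minimal runnable sketch of that lookup, under stated assumptions: the cache is an in-process dict, `CACHE_KEY_SECRET`, `CachedLogin`, `store` and `cached_login` are hypothetical names, and PBKDF2 is used only because it ships with the standard library. It shows the identifiers going into the cache key while the password is checked separately against a salted hash.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass
class CachedLogin:
    salt: bytes
    password_hash: bytes
    session_data: dict


# Hypothetical server-side secret used only for deriving cache keys.
CACHE_KEY_SECRET = os.urandom(32)
cache = {}  # maps cache key (bytes) -> CachedLogin


def cache_key(user_id: str, username: str) -> bytes:
    # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
    parts = [user_id.encode("utf-8"), username.encode("utf-8")]
    msg = b"".join(len(p).to_bytes(4, "big") + p for p in parts)
    return hmac.new(CACHE_KEY_SECRET, msg, hashlib.sha256).digest()


def hash_password(password: str, salt: bytes) -> bytes:
    # Illustrative iteration count, not a recommendation.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def store(user_id: str, username: str, password: str, session_data: dict) -> None:
    salt = os.urandom(16)
    entry = CachedLogin(salt, hash_password(password, salt), session_data)
    cache[cache_key(user_id, username)] = entry


def cached_login(user_id: str, username: str, password: str):
    entry = cache.get(cache_key(user_id, username))
    if entry is None:
        return None  # cache miss: fall through to the real login path
    candidate = hash_password(password, entry.salt)
    if hmac.compare_digest(candidate, entry.password_hash):
        return entry.session_data
    return None  # wrong password
```

Because the password never enters the cache key, an over-long username can at worst cause a cache miss, not a successful login with the wrong password.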
Of course, if old sessions or passwords remain valid in any way after a password change, you are doing something wrong.
What I wonder is: considering a KDF is meant to be expensive, why is IO more expensive to the point it needs a cache?
> why is IO more expensive to the point it needs a cache
The advisory mentions it's only exploitable if the upstream auth server is unresponsive. So it seems to be mainly for resilience.
At least that's my immediate thought, could be wrong.
The main attack vector these days is GPU-based compute. There, SHA* algorithms are particularly weak because they can be so efficiently computed. Unlike SHA algorithms, bcrypt generates high memory contention, even on modern GPUs.
Add in the constraints of broad support, low "honest use" cost, and maturity (extensive hostile cryptanalysis), and bcrypt remains one of the better choices even 25 years later.
That said, bcrypt's main limitation is it has a low memory-size cost. There are some newer algorithms that improve on bcrypt by increasing the memory-size cost to more than is practical even for FPGA attacks.
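As one illustration of a tunable memory cost, Python's standard library exposes scrypt through `hashlib.scrypt`. The comment above does not name a specific algorithm, so picking scrypt is my assumption, and the parameters below are illustrative rather than a recommendation. scrypt's working memory is roughly 128 * n * r bytes.

```python
import hashlib
import os


def scrypt_hash(password: bytes, salt: bytes,
                n: int = 2**14, r: int = 8, p: int = 1) -> bytes:
    # n is the CPU/memory cost, r the block size, p the parallelism factor.
    return hashlib.scrypt(password, salt=salt, n=n, r=r, p=p, dklen=32)


def approx_memory_bytes(n: int, r: int) -> int:
    # Rough size of scrypt's main working buffer.
    return 128 * n * r


salt = os.urandom(16)
digest = scrypt_hash(b"correct horse battery staple", salt)

print(len(digest))                                     # 32
print(approx_memory_bytes(2**14, 8) // (1024 * 1024))  # 16 (MiB)
```

Raising `n` raises the memory an attacker must provision per guess, which is the property the comment says bcrypt lacks.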
More importantly, bcrypt didn't actually fail here. The vulnerability happened because Okta didn't use it correctly. All crypto is insecure if you use it wrong enough.
I understand performance concerns and design trade-offs, but I would expect a secure hashing function in 2024 to do proper message scheduling and compression, or to return errors when truncation happens.
I suppose 90s culture is hip again these days, so maybe this does make sense?
2024-07-23 - Vulnerability introduced as part of a standard Okta release
This is not an "Okta is old" issue. This was new code written in 2024 that used a password hashing function from 1999 to build a cache key.
To be fair, they're basically salting with the userid and username. Still unorthodox to be sure.
If the salt is externally known, which the username and userID necessarily are, then the rainbow table for that account can be computed entirely offline, defeating the point of salting.
It's also possible to build a rainbow table when you already know an account is high-value and have the salt. You can't go download that rainbow table - you'll have to compute it yourself, so the cost to the attacker is higher. But if the account is valuable enough to justify targeting specifically, you'll do it.
Salts are not intended to be secrets.
If you want to treat a salt as if it was a private key, that would only provide additional protection for the very specific circumstance where the user hash is compromised, but the corresponding salt was not.
So you basically bruteforce the password for a specific account before you get the actual hash but after you know the hashing scheme? I don’t see how this helps with any sort of attack though.
You can test this for yourself by creating a user account, then editing the master password database and manually changing the username without recalculating the password hash. The password will still work. If the username was part of the hash input, the password would fail.
Though you can salt a hash using a function that does not take a distinct salt input by just concatenating the salt with the value. This is a relatively common practice, but of course only works if there is no truncation of the salted input.
Username: x@example.xyz.com
Password: /!@#$%
Concatenated: x@example.xyz.com/!@#$%
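The ambiguity in that concatenation can be shown with a short sketch. Length-prefixing is a technique I am swapping in for illustration; nobody in the thread proposes it, and the function names are hypothetical.

```python
def naive_concat(username: str, password: str) -> bytes:
    # Plain concatenation: field boundaries are lost.
    return (username + password).encode("utf-8")


def length_prefixed(username: str, password: str) -> bytes:
    # Each field is preceded by its byte length as a 4-byte big-endian integer.
    fields = [username.encode("utf-8"), password.encode("utf-8")]
    return b"".join(len(f).to_bytes(4, "big") + f for f in fields)


a = ("x@example.xyz.com", "/!@#$%")
b = ("x@example.xyz.com/", "!@#$%")

print(naive_concat(*a) == naive_concat(*b))        # True: two logins, one string
print(length_prefixed(*a) == length_prefixed(*b))  # False
```

An unambiguous encoding only fixes the boundary problem. It does nothing about a hash function that silently drops bytes past a length limit.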
Which is very easy to do without losing any desired functionality as opposed to delimiters in the ASCII character range.
> and you need to consider possible truncation scenarios
In particular hashing libraries worth using never have this problem.
> and make sure they won't cause silent failures at any point.
They literally only need to exist in the data passed to one function call. Afterwards they are neither needed nor significant.
One pattern I bump up against from time to time is the delta between using a perfectly defensible technique for a given use-case (safe delimiters when constructing an input for a specific function) versus a desire to have each decision be driven by some universal law (e.g. "if you're streaming data between services, using null bytes as delimiters might not be safe if consuming services may truncate at null bytes, so NEVER use null bytes as delimiters because they can be unsafe")
It's not even a matter of one "side" being right or wrong. You can simultaneously be right that this is perfectly safe in this use-case, while someone else can be right to be concerned ("need to consider possible") because the code will forever be one refactor or copy/paste away from this concatenated string being re-used somewhere else.
I'll note that the reason we're here in the first place is that they were using a password hash library with a completely unacceptable API.
Also, we're talking about user_id, not user_email, so it should always be the same length. Well, unless you're silly and using a database sequence for IDs.
It's like complaining about how dangerous an axe is because it's super sharp. You don't complain, you just don't grab the blade section, you grab it by the handle.
Not letting people use your API incorrectly is API design 101.
To be clear, this is not the fault of the bcrypt algorithm; all algorithms have their limitations. This is the fault of bcrypt libs everywhere: when they implement bcrypt, they should reject input longer than the limit, and maybe offer an unsafe_ alternative without the check.
Can't tell if it's an issue with BCrypt, with the state data going into the key, or with the combo-cache lookup tho.
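A sketch of what such a check could look like. This is not the API of any real bcrypt library: `checked_hash`, `unsafe_truncating_hash` and `stand_in` are hypothetical names, and the actual hash is passed in as a callable so the example runs without third-party packages.

```python
import hashlib
from typing import Callable

BCRYPT_MAX_INPUT = 72  # bytes of input that bcrypt actually uses

HashFn = Callable[[bytes], bytes]


def checked_hash(password: bytes, hash_fn: HashFn) -> bytes:
    """Refuse over-long input instead of silently truncating it."""
    if len(password) > BCRYPT_MAX_INPUT:
        raise ValueError(
            f"input is {len(password)} bytes; limit is {BCRYPT_MAX_INPUT}"
        )
    return hash_fn(password)


def unsafe_truncating_hash(password: bytes, hash_fn: HashFn) -> bytes:
    """Opt-in escape hatch that keeps the truncating behaviour."""
    return hash_fn(password[:BCRYPT_MAX_INPUT])


def stand_in(data: bytes) -> bytes:
    # Stand-in for a real bcrypt call, used only to keep the sketch runnable.
    return hashlib.sha256(data).digest()


long_input = b"a" * 72 + b"secret"
try:
    checked_hash(long_input, stand_in)
    rejected = False
except ValueError:
    rejected = True

print(rejected)  # True
```

With the check as the default, the caller has to type `unsafe_` to get truncation, which makes the risky path visible in code review.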
Go fuck yourselves.
Sincerely, Everyone in the industry
-- With Love, Okta.
Those compliance companies are (mostly) all just checking a box. It's (mostly) security theater from people who wouldn't know security if it bit them in the nether regions.
Even if that wasn't true, there's probably no box in any compliance regime that says "Yes, we loudly promulgate our security failures from the nearest rooftop at 10am on a weekday" (and it's always five o'clock somewhere, right?)
If it helps (I know it doesn't), the Executive Branch likes to do this with poor job number revisions, too, lol
IMO, better to choose point solutions and combine them.
It was a fuzzer of some sort
What's your point? That rewriting `bcrypt` in something else magically fixes this?
AIUI, the issue is that `bcrypt` only uses the first 72 bytes of the input to create a hash.
It's like using a flat-head screwdriver as a hardwood chisel and then the handle breaks off after the third strike.