I mean, why not tell everyone our password hashes? (theobsidiantower.com)
Here's an md5 sum of a not-that-great password I just made up. It's 14 characters long, but has plenty of guessable features. Is it crackable?
1cf016ea3cb1f2aa2ccb59c196d0e704
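To make the question concrete, here is a minimal sketch of what a dictionary-style attack on a single unsalted MD5 looks like. The word list, the suffixes, and the names `candidates` and `crack_md5` are made up for illustration and have nothing to do with the actual password behind the hash above; real tools such as John the Ripper apply far larger rule sets.

```python
import hashlib
from itertools import product


def md5_hex(text: str) -> str:
    """Hex MD5 digest of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def candidates(words, suffixes):
    """Yield simple guessable variants: word and Word, each with every suffix."""
    for word, suffix in product(words, suffixes):
        for base in (word, word.capitalize()):
            yield base + suffix


def crack_md5(target_hex, words, suffixes):
    """Return the first candidate whose MD5 matches the target, or None."""
    for guess in candidates(words, suffixes):
        if md5_hex(guess) == target_hex:
            return guess
    return None


# Hypothetical inputs, chosen only so the demo finds something.
WORDS = ["correct", "horse", "battery"]
SUFFIXES = ["", "1", "123", "2017!"]

target = md5_hex("Battery2017!")
print(crack_md5(target, WORDS, SUFFIXES))
```

The point is only that length alone does not help if the password is assembled from pieces an attacker would try anyway.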
If such a company's database of hashed passwords is leaked, then an attacker doesn't even have to crack the hashes - the hash itself is a valid version of the password. Yet I've seen this behavior at multiple companies; only one of them pushed back against my request to remove that "feature", and I didn't stay with them much longer after that.
Even if you took only a small percentage of the IP addresses in Europe, this could have a snowball effect. You take the IP addresses belonging to a popular mail service used by other domains, then you use the admin email addresses to reset passwords, and eventually Europe's internet is stolen.
In order to "steal" IP addresses (get them routed to you) you would need to buy a connection to at least one exchange point, probably several if you want all the traffic for the target to route to you and not just some traffic from some networks. You'd need to buy rackspace somewhere with a connection to the exchange point, install routers, establish BGP peerings with the exchange point (if they're doing route reflection) or with all the other major networks at the exchange.
There are multiple steps along the way where humans would look at the prefixes you were going to be announcing. This would include looking them up in RIPE, but anything more than a cursory inspection would likely reveal your ruse.
At this point it becomes more of a social engineering attack, and even if you got as far as announcing it, there are things like BGPMon that would pick up the fraudulent announcement pretty quickly and you'd likely find that the cable was pulled out of your router pretty fast.
That ending was an incredibly well delivered stab at Deutsche Telekom. This is why I love vigilante security.
john --test --format=nt
Benchmarking: NT [MD4 128/128 X2 SSE2-16]... DONE
Raw: 29037K c/s real, 29037K c/s virtual
john --test --format=bcrypt
Will run 16 OpenMP threads
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X3]... (16xOMP) DONE
Raw: 5472 c/s real, 490 c/s virtual
Edit: NT hashes are one round of MD4. These are Microsoft Active Directory hashes. OpenBSD uses Blowfish hashes by default.
https://www.ripe.net/manage-ips-and-asns/db/support/updating...
>"I hope like me you were immediately drawn to the ‘auth’ fields. As the name implies this field contains authentication information for controlling this object in the RIPE database. RIPE supports a couple of different auth types like Single Sign On (SSO), public key cryptography, and of course md5."
It's authentication to manage that entry in the RIPE database.
Single unsalted broken MD5 is a far cry from scrypt... and even scrypt is probably a bad idea with all this crypto currency hashing hardware out there, unless you have a seriously strong password.
Just don't publish hashes.
These hashes are not unsalted MD5. They are md5crypt ($1$[salt]$[hash]), as found in many Unix-likes and some Cisco IOS.
You can regenerate a rainbow table which uses that salt, but you'd have to generate a rainbow table for every password, since each password has its own random salt. I don't know how rainbow tables work exactly, but I'd assume an old fashioned brute force attack or dictionary attack is cheaper than making a rainbow table for each password.
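A rough sketch of why a per-password salt defeats precomputation: a lookup table built for one salt is useless for a hash made with another. This uses a single salted SHA-256 purely for illustration (it is not md5crypt), and the word list and function names are hypothetical.

```python
import hashlib
import os


def salted_hash(password: str, salt: bytes) -> str:
    # Illustrative only: one salted SHA-256, not the md5crypt algorithm.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def precompute(wordlist, salt: bytes) -> dict:
    """Build a hash -> password lookup table that is valid for one salt only."""
    return {salted_hash(word, salt): word for word in wordlist}


WORDLIST = ["password", "letmein", "hunter2"]
salt_a = os.urandom(8)
salt_b = os.urandom(8)

table_a = precompute(WORDLIST, salt_a)
hash_a = salted_hash("hunter2", salt_a)
hash_b = salted_hash("hunter2", salt_b)

print(table_a.get(hash_a))  # the table was built for salt_a
print(table_a.get(hash_b))  # same password, different salt: no entry
```

With a fresh salt per password, the precomputation has to be redone per password, at which point it costs at least as much as a plain dictionary attack.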
Wait, was that just a straight bash command? Is this installed on my computer?
>$ whois
>usage: whois [-aAbdgiIlmQrR6] [-c country-code | -h hostname] [-p port] name ...
Holy shit lol, that's neat.
Welcome to 1982..1985. This command predates bash.
https://tools.ietf.org/html/rfc812
http://minnie.tuhs.org/cgi-bin/utree.pl?file=2.11BSD/src/ucb...
So a hash, unless properly salted, works just fine.
Many people actually use a single password everywhere. Or at least for similar things.
Of course, you could still crack some (problem), so keeping multiple secrets hidden through obscurity (the hashes, the salts, etc.) is another layer of security.
This doesn't guarantee security, but it's certainly more secure. But it is additive: there's no reason to just use MD5 (or plaintext) because "my hashes are secret".
https://developer.github.com/v3/users/keys/
This is a different situation and public keys are not directly analogous to password hashes: there isn't a reliable way of cracking public keys in the same sense that there's a semi-reliable way of cracking hashes. But it was still strange and uncomfortable to me that they would reveal this "target" (and if there were specific key generation bugs, like RNG seeding errors, people might actually be able to crack a few of them and know that they had succeeded).
Relatedly, I was thinking about the magic crypto-cracking device in the movie Sneakers. Once they had it, they could immediately use it to log on to random network-connected services, defeating the authentication. So, how is that supposed to work? How do they automatically know what credentials would be accepted for a particular service? Are there common network authentication protocols based on public-key cryptography that have the property that the verifier tells the prover the public keys that it trusts?
That is a terrible idea because agencies like the NSA or GCHQ with unfathomable resources and techniques will crack them and never tell anyone. Then you'll have a compromised account, the provider won't know, the user won't know. Then the agency would be able to compromise the account and publish whatever they wanted as that identity.
Given there are tricks to mask an IP address, or they straight up tap the wires, that's a #1 way to character assassinate any dissident or someone who they dislike.
> This is to get rid of the fiction that these are ever private and to eliminate an incentive to break in.
And why do you assume criminals wouldn't also try to gain access to the systems? Passwords aren't typically the valuable information in a system, they're there to protect the more valuable data.
As opposed to the current situation where they can just get the info from Facebook/Google/etc. directly? At this point you may as well assume state actors have access to anything you put on the internet.
One, an agency with truly unfathomable resources and techniques is going to be able to get into your network even if you don't post the hashes publicly.
Two, all information we have (e.g., the Snowden leaks) implies that NSA/GCHQ/etc. are at best only slightly ahead of academia in terms of cryptanalysis. The only real mathematical revelation we had is that they did in fact deliberately compromise Dual_EC_DRBG, which the academic community had suspected almost since the standard was introduced, and which didn't even use any mathematics unknown to the public (the academic community knew how to build similarly back-doored systems, which is how they recognized such a system). It turned out that they had focused more on identifying and exploiting operational weaknesses (see also, "I Hunt Sys Admins") and not on discovering cryptographic attacks that the public didn't know about - so, again, they're already on your network.
Three, and most importantly, I'm in the US. I'm subject to the laws of the US. The US government is outside of my threat model, because they can just send me a national security letter whenever they want, and I can't tell my users. Or if they don't want to do that, they can just plant a mole. I certainly neither interview sysadmins well enough to tell if they're secretly working for the government, nor have I been interviewed as a sysadmin well enough for anyone to tell, either. (Remember that the mole could be an actual government employee who believes what they're doing is right, or just a smart kid who took a plea deal for buying some nootropic on the dark web.)
My threat model is everyone else. If the government wants to ruin one of my customers' lives, they can already do that, they don't need to hack me. My threat model is the mass media, my customers' abusive exes, random extortionists in Eastern Europe or somewhere paid by cryptocurrency, bored teenagers whose sense of morality hasn't yet developed to realize that SWATting people is a problem, etc.
Designing secure systems to be secure against the NSA is an extremely hard problem, and if you focus on solving it, you're very likely not to design systems that are secure against the actual attacks your users are at risk from.
Chances are they already have 'em, from a compromised employee, a zero-day exploit, or a SQL injection hole. Far more likely than them having cracked bcrypt.
I can throw this into a structure indistinguishable from a blockchain if any VCs want to invest ;)
There's probably some reason it wouldn't work. Dictionary attacks are an obvious possibility; if your password is "password" the only thing you're depending on is nobody being able to get at the hashes. It might also expose password reuse, though nonces/salts might solve that. Hrm.
This smells a bit like public crypto - public database of public keys (hashes), on login you're challenged to produce proof that you have the private key (the password), and the transformation provides you a means to do that without exposing the private key itself.
Really it points to the idea that we should be moving in that direction for auth. Here's one project I've heard about: https://www.grc.com/sqrl/sqrl.htm
but something like this is strengthened through password stretching. I think this is good practice anyway, as it makes them much harder to brute force or dictionary attack if the data is compromised.
1. Password hashing functions are not regular hash functions run multiple times. This is not only false at a macro level (i.e. we don't just run SHA-2 several times to get something resembling PBKDF2), it's false in terms of core construction. Password hashing functions rely on fundamentally different mathematical properties than regular hash functions. It's not like 3DES and DES: a secure password hashing function requires more than just a higher iteration count.
2. "Key stretching" does not refer to running cryptographic hash functions multiple times. Key stretching refers to the act of generating a secret key from an otherwise weak passphrase or input, generally supplied by a user. You use the user's passphrase to (in effect) seed a function that outputs something much more resistant to brute-forcing. Key stretching is used in key derivation functions, but what you described is not key stretching.
3. General purpose (as opposed to password) hashing functions are designed to be fast, not slow. Take a look at BLAKE2's homepage for speed comparisons - speed is a selling point: https://blake2.net/. In addition you can read the following from the handy FAQ:
You want your hash function to be fast if you are using it to compute the secure hash of a large amount of data, such as in distributed filesystems (e.g. Tahoe-LAFS), cloud storage systems (e.g. OpenStack Swift), intrusion detection systems (e.g. Samhain), integrity-checking local filesystems (e.g. ZFS), peer-to-peer file-sharing tools (e.g. BitTorrent), or version control systems (e.g. git). You only want your hash function to be slow if you're using it to "stretch" user-supplied passwords, in which case see the next question.
ssh-keygen -q -t rsa -b 4096 -N "passphrase" -C "mygithub@someaddress.org" -f ${HOME}/.ssh/.ghub
then in your ${HOME}/.ssh/config:
IdentitiesOnly yes
Host github.com
    Hostname ssh.github.com
    Port 443
    User git
    IdentityFile /home/username/.ssh/.ghub
    ForwardAgent no
Not that it matters in this case, just sayin'.
The only bad thing about the GitHub issue there is that a de-anonymization attack is possible, as an SSH server will tell you if it accepts a given public key... if, say, you had the same SSH key on your GitHub account and a server you wanted to keep private, this could be bad to say the least. And SSH clients offer every id_* key to every server they connect to, so if you connect to an untrustworthy server, even over an anonymity network like Tor, your client may offer a key that identifies you (use your ssh config!).
Could you elaborate more on this specific attack?
https://github.com/<user>.keys
The NSA is going to avoid the former as much as it can, because there is a huge chance they get burned in some way. Anything that they can passively slurp is a huge win for them.
It's literally two passes that are memory independent, then two that are memory dependent, when r = 4.
Seriously though, that's the first un-ironic reference to SQRL that I've ever seen.
Basically, don't reuse keys in places where you might not want to be identified and use ssh configs to prevent announcing all keys to the world.
As you say, public keys were designed to solve a key distribution problem. Inherent to that problem is the idea that a public key could become, well, public. They solve that problem very well, and there is no intrinsic reason why you shouldn't just publish them because they were intended to be defensible against that very eventuality.
Practically speaking I disagree that GitHub has done anything wrong here - changing habits to diminish the publish-ability of public keys because the SSH protocol exhibits suboptimal behavior encourages further lazy security for the SSH protocol.
We shouldn't tap dance around an SSH-specific problem by claiming that public keys need to be kept secret. That's absurd, we already have private keys. Moreover, it is detrimental to other protocols that rely on publicly verifiable signatures and nonrepudiation to adopt this sort of perspective.
But Github is using public key cryptography as implemented in SSH - if that has a failure, Github should take some blame for not working around it, especially when they are going out of their way to expose data that has little benefit IMO.
Anyway, SSH is orthogonal to one of my points, which, phrased another way, is that publishing the link between two identities (the key itself, and the key-owner's Github profile) without consent or need is unethical because it violates the privacy of the owner. I believe there is precedent in the PGP world (e.g., "I believe it's poor etiquette to upload someone else's key to a keyserver as you deny them that choice."[0])
I sort of get the "detrimental to other protocols" and "lazy security for the SSH protocol" points, but when you talk about publishing public keys, do you acknowledge a difference between "key XYZ is in use on Github" and "key XYZ identifies user ABC on Github"? I'm saying the latter is unwise and unkind, and it would be even if the SSH protocol didn't have this particular failure.
> SSH client: I support key auth
> SSH server: Let's use key auth
> SSH client: Do you take this public key hash: XXXXXX?
> SSH server: Yes I do
or
> SSH server: No I don't
Repeat for as many keys as you like.
You can therefore grab a list of known public keys for a given person and ask a given ssh server if it knows about the given public key. Given a few days you could even scan the entire IPv4 space for servers taking a given public key. Username must match, etc of course, but it's an attack many people might not consider.
The difference occurs mostly when you start chaining hashes. In that case, a salt is only relevant in the first hash, whereas the keyed hash needs the key at every hash round.
I thought the two schemes were conceptually different, leading to different engineering tradeoffs: With salts, you assume the attacker can gain access to it. With keyed-hashing, you simply have a second piece of equally-secret information, and you hope it doesn't get leaked.
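The difference when chaining can be sketched roughly as follows, assuming SHA-256 and HMAC-SHA-256 as stand-ins (neither comment names a specific construction, and the function names are hypothetical). The salt only enters the first round, while the key is needed in every round.

```python
import hashlib
import hmac
import os


def chained_salted(password: bytes, salt: bytes, rounds: int) -> bytes:
    """Salt goes into the first round only; later rounds hash the digest."""
    digest = hashlib.sha256(salt + password).digest()
    for _ in range(rounds - 1):
        digest = hashlib.sha256(digest).digest()
    return digest


def chained_keyed(password: bytes, key: bytes, rounds: int) -> bytes:
    """The secret key is mixed in again at every round via HMAC."""
    digest = hmac.new(key, password, hashlib.sha256).digest()
    for _ in range(rounds - 1):
        digest = hmac.new(key, digest, hashlib.sha256).digest()
    return digest


salt = os.urandom(16)   # stored next to the hash, assumed known to an attacker
key = os.urandom(32)    # kept apart from the database, assumed secret

print(chained_salted(b"hunter2", salt, 1000).hex())
print(chained_keyed(b"hunter2", key, 1000).hex())
```

The engineering tradeoff is in the storage: the salt lives in the same row as the hash, while the key has to be kept somewhere the database dump does not reach.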
Why risk it when generating an ed25519 or rsa4096 keypair is cheap?
Yes, except that a 4096 bit key is not just "one more bit", it's double the number of bits.
> Someone who is be able to crack 2048 bit keys, probably also has the opportunity to crack 4096 bit keys
No, it would require an impossibly large amount of effort to crack 4096 bit keys compared to 2048 bit keys.
> Smartphones and embedded devices want to use as less energy as possible
They can use ed25519 then.
> With an 4096 bit key, you force your communication partners to spent an unnecessary amount of energy.
They spend more energy by running ad-ridden "apps" and electron monstrosities.
There's a Wikipedia line that could use your input (unless I'm missing why this would still be accurate).
> Key stretching functions, such as PBKDF2, Bcrypt or Scrypt, typically use repeated invocations of a cryptographic hash to increase the time required to perform brute force attacks on stored password digests.
https://en.wikipedia.org/wiki/Cryptographic_hash_function#Pa...
This is not false, it is indeed one of the techniques that they use.
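For reference, a minimal sketch of password stretching with PBKDF2 from Python's standard library (`hashlib.pbkdf2_hmac`). The iteration count and the helper names `hash_password` and `verify_password` are assumptions for illustration, not a recommendation from the thread.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a 32-byte key from the password; returns (salt, derived_key)."""
    if salt is None:
        salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Every guess an attacker makes costs the same `ITERATIONS` rounds of HMAC, which is the "increase the time required" part of the Wikipedia sentence.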
I believe this paper might be the origin of the term (correct me if I'm wrong).
I thought it was a good read if anyone else is interested:
https://www.schneier.com/academic/paperfiles/paper-low-entro...
However, any possible password with a standard printable ASCII character set will typically be found in Rainbow tables up to 10 characters long making expensive cracking unnecessary. [not quite right see edit]
Rainbow tables are just giant tables where the key is the hash and the value is the string that generated it.
However, your example being 14 characters long is a bit long to be in most readily available rainbow tables.
This is why using salts and peppers is incredibly important regardless of what hash you use.
Edit: minor(ish) correction to the previous sentence. Full alphanumeric with punctuation and digits is available readily in smaller password lengths but the 10 character long datasets seem to be mostly only lower case characters and digits.
Really? Storing every possible 10 character long printable ASCII password plus its MD5 hash would require approximately 1.5 zettabytes[1].
[1] 95^10 * (16+10)
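The footnote's arithmetic can be checked in a few lines. The 16 + 10 bytes per entry (raw MD5 digest plus the password itself) comes from the footnote; using decimal zettabytes (10^21 bytes) is an assumption.

```python
CHARSET = 95              # printable ASCII characters
LENGTH = 10               # password length
BYTES_PER_ENTRY = 16 + 10 # raw MD5 digest + the 10-character password

entries = CHARSET ** LENGTH
total_bytes = entries * BYTES_PER_ENTRY
zettabytes = total_bytes / 10**21  # decimal zettabytes

print(f"{entries:.3e} passwords, {zettabytes:.2f} ZB")
```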
Rainbow tables are a tradeoff between storing every hash, and generating them during cracking. You get to pick how much space you want to spend to speed up cracking.
Umm what? Even assuming a limited set of ASCII i.e. Base64, on what magical medium do you suppose a 64^10 rainbow table is stored?
For example, A rainbow table might use chain lengths of 10,000. This means that for every 10,000 hashes calculated, only 1 (really 2) are kept. Each chain ends up as a row in the table, which is then sorted. When cracking, the target hash is hashed and reversed up to 10,000 times looking through the table.
The more compression, the less space needed, but the longer the lookup. The original Windows XP rainbow table cracking CD published along with the rainbow table paper was only ~500 MB, but was able to crack pretty much every Windows password.
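The chain idea can be sketched as a toy: 4-letter lowercase passwords, chains of length 100, and only the (endpoint, start) pair of each chain stored. All names, the reduction function, and the parameters are made up for illustration; real tables use far longer chains and much larger keyspaces, and a toy table like this covers only part of the keyspace.

```python
import hashlib
import string
from itertools import product

ALPHABET = string.ascii_lowercase
PW_LEN = 4
CHAIN_LEN = 100


def md5_hex(pw: str) -> str:
    return hashlib.md5(pw.encode()).hexdigest()


def reduce_hash(digest_hex: str, step: int) -> str:
    """Map a digest back into the password space; differs per chain position."""
    n = int(digest_hex, 16) + step
    chars = []
    for _ in range(PW_LEN):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)


def chain_plaintexts(start: str) -> list:
    """Return [p_0, ..., p_CHAIN_LEN]; the last element is the stored endpoint."""
    chain = [start]
    for step in range(CHAIN_LEN):
        chain.append(reduce_hash(md5_hex(chain[-1]), step))
    return chain


def build_table(starts) -> dict:
    """Keep only endpoint -> start for each chain; everything else is discarded."""
    return {chain_plaintexts(s)[-1]: s for s in starts}


def lookup(table: dict, target_hex: str):
    """Guess each possible chain position, walk to the endpoint, then replay."""
    for pos in range(CHAIN_LEN - 1, -1, -1):
        pw = reduce_hash(target_hex, pos)
        for step in range(pos + 1, CHAIN_LEN):
            pw = reduce_hash(md5_hex(pw), step)
        start = table.get(pw)
        if start is None:
            continue
        # Replay the chain from its start; false alarms simply fall through.
        for cand in chain_plaintexts(start)[:-1]:
            if md5_hex(cand) == target_hex:
                return cand
    return None


STARTS = ["".join(p) for p in product("abcde", repeat=PW_LEN)][:200]
TABLE = build_table(STARTS)

# Pick a password that sits in the middle of a chain and was never stored.
secret = chain_plaintexts(STARTS[-1])[37]
print(lookup(TABLE, md5_hex(secret)) == secret)
```

Here 200 stored pairs stand in for up to 20,000 hashed passwords, at the price of recomputing up to a chain's worth of hashes per lookup position.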
This isn't nation-state level cost. Individuals could afford this level of hardware. Many individuals have access to systems of this size, for example through botnets, schools, spare junk in the local IT department closet, etc.
It's very reversible.
Call it even ~30^14 / 348 billion per second = 1,374,416,379 seconds. So, they can break passwords with some pattern to them, but not really brute force em.
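The same estimate as a few lines of Python, converting the result to years. The keyspace and the 348 billion guesses per second are the figures from the comment above; the year length of 365.25 days is an assumption.

```python
KEYSPACE = 30 ** 14          # ~30 symbols, 14 characters
GUESSES_PER_SECOND = 348e9   # rate quoted in the comment

seconds = KEYSPACE / GUESSES_PER_SECOND
years = seconds / (365.25 * 24 * 3600)

print(f"{seconds:,.0f} seconds, about {years:.1f} years")
```

On average a brute-force search finds the password after covering half the keyspace, so the expected time is roughly half of this.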
It's also an unsalted hash, so you could brute force an unlimited number of passwords at the same time without additional resources. Someone with a budget of a few million dollars could break every password in the world in a month.
So in other words, definitely don't publicize unsalted MD5 hashes of your passwords.
Not surprised that no one has tried yet.
If someone can point me towards the tools and how to set it up, I'll leave my gtx1070 at it overnight and see.
If you look at the current cracking benchmarks of GPUs (https://gist.github.com/epixoip/a83d38f412b4737e99bbef804a27...), there is an easily quantifiable difference between bcrypt and MD5: 21 bits. (https://www.wolframalpha.com/input/?i=log2(200*%5E9)-log2(10...)
That means under current GPU architecture, bcrypt is basically like "adding 3-4 characters (or 1.5 diceware words)" for free to your password. Can you basically just add 3-4 characters to your password? Sure, but not without user friction, and certainly you can't think that way as the developer of the system, because you're trying to give a small leg up to even the most vulnerable by salting and bcrypt/PBKDF2/Argon hashing.
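The conversion from bits to characters or words is a one-liner each. This sketch assumes a 95-symbol printable ASCII alphabet and the standard 7776-word diceware list, with every symbol or word chosen uniformly at random.

```python
import math

EXTRA_BITS = 21                    # bcrypt vs. MD5 gap quoted above

bits_per_char = math.log2(95)      # random printable ASCII character
bits_per_word = math.log2(7776)    # random diceware word (6^5 entries)

extra_chars = EXTRA_BITS / bits_per_char
extra_words = EXTRA_BITS / bits_per_word

print(f"{extra_chars:.1f} characters, {extra_words:.1f} diceware words")
```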
What about theoretical limits? Well, there is another way to approach this: Landauer's principle (https://en.wikipedia.org/wiki/Landauer%27s_principle), which considers the theoretical minimum energy of a bit flip of information - so this even covers future computing technologies. Even if you used up all available mass-energy in the entire sun, it is only theoretically possible to perform 2^225.2 operations (https://security.stackexchange.com/questions/6141/amount-of-...). 225 bits of entropy is roughly a 35-character (printable ASCII) password.
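A back-of-the-envelope version of that bound, as a sketch. It assumes the Landauer cost of k_B * T * ln 2 per operation at T = 300 K and a solar mass of 1.989e30 kg; those inputs are my assumptions (they land close to the quoted 2^225.2), and the linked answer may use different ones.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
C = 299_792_458.0         # speed of light, m/s
SUN_MASS = 1.989e30       # kg (assumed value)
TEMPERATURE = 300.0       # K (assumed; room temperature)

energy = SUN_MASS * C ** 2                        # total mass-energy, joules
energy_per_op = K_B * TEMPERATURE * math.log(2)   # Landauer limit per bit flip

bits = math.log2(energy / energy_per_op)
chars = bits / math.log2(95)                      # printable ASCII characters

print(f"2^{bits:.1f} operations, about {chars:.1f} characters")
```

Running the machine colder raises the bound by a few bits, which does not change the conclusion.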
(Note that you can't do this with MD5 - it has only a 128-bit hash space, before preimage attacks, the best of which lowers it to 123 bits).
So the lesson is: use slow hashes to give some protection to the vulnerable and people whose password complexity is "on the edge". Use a password manager so that the rest of your passwords can be comfortably > 128 bits in complexity, without reuse. And then forget about passwords because after that, every other part of the security system becomes more important.
I haven't used it, but FAQs, Forums, wikis, and tutorials are all out there.
Note: "In the third quarter of 2016, approximately 144.6 million hard disk drives were shipped worldwide", so it would take something like every HDD ever produced to fit that much data.
PS: Plus, that 30 was lowballing. For the full search space it's 26 (lower case letters) + 26 (upper case letters) + 10 (numbers) + some number of special characters. So ~100^14, or ~20,907,515x as large, aka 10^17 TB.
No storage, other than the hashes you're attempting to reverse -- which for a data dump from even a large site like Yahoo wouldn't be very large at all. Megabytes.
While "no additional resources" might not have been strictly accurate, the lookup would be practically free compared to the time it takes to compute the hash.
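A small sketch of why that is: with unsalted hashes, each guess is hashed once and then checked against all targets with a single set lookup. The leaked hashes and guesses here are invented for the demo, and `crack_many` is a hypothetical helper.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def crack_many(target_hashes, guesses) -> dict:
    """Hash each guess once and test it against every target at the same time."""
    remaining = set(target_hashes)
    found = {}
    for guess in guesses:
        digest = md5_hex(guess)       # the expensive part: one hash per guess
        if digest in remaining:       # set lookup, independent of target count
            found[digest] = guess
            remaining.discard(digest)
            if not remaining:
                break
    return found


# Made-up "leak": three unsalted hashes, two of which are in the guess list.
leaked = [md5_hex(p) for p in ("letmein", "hunter2", "Tr0ub4dor&3")]
guesses = ["password", "letmein", "qwerty", "hunter2"]

cracked = crack_many(leaked, guesses)
print(sorted(cracked.values()))
```

With a per-password salt this stops working, because each guess would have to be rehashed for every salt.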