Show HN: SHA-256 explained step-by-step visually (sha256algorithm.com)

1241 points by manceraio 4 years ago | 137 comments

oconnor663 4 years ago |

Oh this is great. When we taught SHA-256 last semester, we linked to this YouTube video: https://youtu.be/f9EbD6iY9zI. Next time we do it, we'll probably link to both. Having several different ways to visualize the same thing is very helpful, and I like that this one moves quickly.

A couple of details missing from this visualization are how you pad a message to be a multiple of the block size, and how you chain blocks together to form a longer message. In the pseudocode at https://en.wikipedia.org/wiki/SHA-2#Pseudocode, that's the "Pre-processing (Padding)" part and the "for each chunk" loop just below it. I get why you'd want to leave those things out, since they're not really the interesting part, and the screen is already pretty packed as it is.
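For anyone who wants to see those two missing pieces in code, here is a minimal Python sketch of the padding step and the split into 64-byte chunks, following the Wikipedia pseudocode. The function names `sha256_pad` and `split_chunks` are made up for illustration and are not taken from the visualization's source.

```python
import struct

def sha256_pad(message: bytes) -> bytes:
    """Pad a message the way SHA-256 does: 0x80, zeros, 64-bit big-endian bit length."""
    bit_length = len(message) * 8
    padded = message + b"\x80"
    # Add zero bytes until 8 bytes remain in the current 64-byte block.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # The final 8 bytes hold the original message length in bits.
    padded += struct.pack(">Q", bit_length)
    return padded

def split_chunks(padded: bytes) -> list:
    """Split the padded message into the 64-byte chunks the main loop iterates over."""
    return [padded[i:i + 64] for i in range(0, len(padded), 64)]

if __name__ == "__main__":
    for size in (3, 55, 56):
        chunks = split_chunks(sha256_pad(b"a" * size))
        print(size, "bytes of input ->", len(chunks), "chunk(s)")
```

A 55-byte message still fits in one chunk, while 56 bytes spills into a second one, which is the jump to two blocks you see when typing into the input.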

If anyone's feeling curious about implementing this yourself, take a look at these project notes: https://github.com/oconnor663/applied_crypto_2021_fall/tree/.... At some point I'll clean that up for public consumption, but for now just ignore the parts about grades and cheating :)

manceraio 4 years ago | |

Thanks for the feedback, and I am glad you'll use it for teaching (which was the main goal of this project)! The padding part is briefly explained in the "dynamic" notes in the left column, but yes, it can be improved. Typing in the input gives you some sense of what it is doing in the background, especially when it jumps to two blocks.

The "for each chunk" is also implemented (which was one of the most difficult parts to synchronize with the UI), but I agree too, I should come up with some way to represent it better. Thanks again :)

fragmede 4 years ago | | |

Minor nit: input could also take hex.

miki725 4 years ago | |

Was about to reply with the link to the project. If anyone is curious about SHA-2, I highly, highly recommend going through the project. Jack did an amazing job explaining everything step by step. Writing the code really helps to understand all the concepts much better.

Drdrdrq 4 years ago | |

Thank you for the link to your repo, this is the first time I've heard about length extension attacks. TIL, appreciate it! This SO answer explains them nicely, if anyone is curious: https://crypto.stackexchange.com/questions/3978/understandin...
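To make the length extension idea concrete, here is a self-contained Python sketch. It assumes a naive MAC of the form SHA256(secret || message). The secret, message, and suffix values are hypothetical, and the small pure-Python SHA-256 exists only so the internal state can be resumed (hashlib does not expose it). The round constants are derived with integer roots rather than pasted in.

```python
import hashlib
import struct
from math import isqrt

MASK = 0xFFFFFFFF

def first_primes(n):
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

def icbrt(n):
    # Integer cube root by binary search (exact, no floating point).
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# First 32 fractional bits of the square / cube roots of the first primes.
IV = [isqrt(p << 64) & MASK for p in first_primes(8)]
K = [icbrt(p << 96) & MASK for p in first_primes(64)]

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def compress(state, block):
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        t1 = (h + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25))
              + ((e & f) ^ (~e & g)) + K[i] + w[i]) & MASK
        t2 = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22))
              + ((a & b) ^ (a & c) ^ (b & c))) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(s + v) & MASK for s, v in zip(state, (a, b, c, d, e, f, g, h))]

def padding(total_len):
    # Padding for a message of total_len bytes: 0x80, zeros, 64-bit bit length.
    return b"\x80" + b"\x00" * ((55 - total_len) % 64) + struct.pack(">Q", total_len * 8)

def sha256(data, state=IV, prior_len=0):
    # prior_len = bytes already absorbed into `state` (0 for a fresh hash).
    data += padding(prior_len + len(data))
    state = list(state)
    for i in range(0, len(data), 64):
        state = compress(state, data[i:i + 64])
    return struct.pack(">8I", *state)

# --- victim side (hypothetical values) ---
secret = b"hypothetical-secret-key"   # the attacker only needs its length
message = b"user=alice"
mac = sha256(secret + message)        # naive MAC: SHA256(secret || message)

# --- attacker side: knows message, mac, and len(secret), but not secret ---
known_len = len(secret) + len(message)
glue = padding(known_len)             # the padding the victim's hash already absorbed
suffix = b"&admin=true"
forged_message = message + glue + suffix
forged_mac = sha256(suffix, state=struct.unpack(">8I", mac),
                    prior_len=known_len + len(glue))

print(forged_mac == hashlib.sha256(secret + forged_message).digest())
```

The attacker never touches `secret`: the published digest is the full internal state, so hashing can simply continue from it. HMAC is the standard construction that avoids this.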

Dowwie 4 years ago | |

What course did you teach?! Have you got a syllabus?

oconnor663 4 years ago | | |

Applied Cryptography (CS-GY 6903) at NYU Tandon. You can find all our programming problem sets in the same repo: https://github.com/oconnor663/applied_crypto_2021_fall.

picture 4 years ago |

So, how do people come up with these things? I assume every aspect of the design is carefully considered to defend it against various attacks. For example, why "right rotate 7 XOR right rotate 18 XOR right shift 3" and not "right rotate 2 XOR right rotate 3 XOR right shift 4"?
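Not an answer to the "why these constants" question, but here is a small Python sketch of what that particular mix does, assuming 32-bit words as in SHA-256. The helper names `rotr` and `small_sigma0` are just labels for this example. It shows a single input bit spreading to two or three output bits in one application.

```python
MASK = 0xFFFFFFFF

def rotr(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK

def small_sigma0(x: int) -> int:
    """Message-schedule mix: right rotate 7 XOR right rotate 18 XOR right shift 3."""
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)

if __name__ == "__main__":
    for bit in (0, 2, 31):
        word = 1 << bit
        mixed = small_sigma0(word)
        # Low bits fall off the end of the shift, so they land in only two places.
        print(f"bit {bit:2d}: {word:032b} -> {mixed:032b} ({bin(mixed).count('1')} bits set)")
```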

taviso 4 years ago |

That's really cool. I made a terrible one for SHA1 years ago, yours is 1000x better.

https://lock.cmpxchg8b.com/sha1/visualize.html

I read a paper at the time where someone described a tool they made to find a near-collision; they explained they were just flipping bits and visually observing the effects. That sounded kinda fun, but they didn't release it, so I tried to replicate it from their description!

dragontamer 4 years ago | |

Your tool seems to capture the gist of differential cryptanalysis better though.

You can track a 1-bit change or 3-bit change to "M" and see how it propagates through the SHA256 rounds in your tool.
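The same experiment can be done at the level of the final digest with a few lines of Python. This is only a rough sketch using hashlib, with made-up helper names `flip_bit` and `bit_diff`: it flips each input bit in turn and counts how many of the 256 output bits change.

```python
import hashlib

def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with the bit at position index (MSB first) flipped."""
    out = bytearray(data)
    out[index // 8] ^= 0x80 >> (index % 8)
    return bytes(out)

def bit_diff(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

message = b"hello world"
base = hashlib.sha256(message).digest()
diffs = [
    bit_diff(base, hashlib.sha256(flip_bit(message, i)).digest())
    for i in range(len(message) * 8)
]
average = sum(diffs) / len(diffs)

if __name__ == "__main__":
    print(f"min {min(diffs)}, max {max(diffs)}, average {average:.1f} of 256 bits changed")
```

For a well-behaved hash the average should sit near 128, i.e. about half the output bits change for any 1-bit change in M.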

----------

So your tool is probably better at understanding the underlying design of SHA2. We know that SHA2 was created well into the era of differential-analysis for example, so the designers would have inevitably done analysis similar to how your tool works.

mabbo 4 years ago |

Before watching this: "Why can't cryptographers just figure out some tricks to crack these hash algorithms?"

After watching this: "How can any cryptographer EVER figure out any trick to crack these hash algorithms?!"

mynameismon 4 years ago | |

> "How can any cryptographer EVER figure out any trick to crack these hash algorithms?!"

More so, "How can any cryptographer EVER figure out any trick to come up with these hash algorithms?!" Seriously, they are incredibly impressive mathematical algorithms. Even coming up with an algorithm that shows the avalanche effect is mind-boggling. Making sure that the algorithm is not biased toward a set of samples AND shows the avalanche effect is tremendously mind-blowing.

sammyo 4 years ago | |

They can't. But there are certainly unsavory actors that are able to trick a certain percentage of the unsuspecting to just give them a private key.

selestify 4 years ago | | |

They actually can, in some situations: https://security.googleblog.com/2017/02/announcing-first-sha...

userbinator 4 years ago |

This reminds me that I've always wanted to make a huge interactive combinational circuit that computes SHA-256 and shows all its internal state, then put it on a site with the claim that anyone who can make its output match a certain clearly-constructed value (e.g. 0123456...ABCD...) will win a prize. No mentions of hash algorithms or other such phrasing to deter anyone. I wonder how many people would try such a "logic puzzle", how much time they'd spend on it, and if we might even get the first successful preimage attack from that.

jjeaff 4 years ago | |

I think making a problem more accessible like that is the fastest path to a solution.

It reminds me of the Stephen J Gould quote:

"I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops."

manceraio 4 years ago | |

I actually started this because my first idea was: can I implement SHA-256 with just TTL logic gates? which should be possible, but it would take months to do.

for puzzles try 12037465 for some coffee :)

nayuki 4 years ago |

My way of explaining step by step visually is by implementing in Excel: AES https://www.nayuki.io/page/aes-cipher-internals-in-excel ; DES https://www.nayuki.io/page/des-cipher-internals-in-excel .

Also relevant: https://www.righto.com/2014/09/mining-bitcoin-with-pencil-an...

y42 4 years ago |

There also exists a written description showing the process in Python, step by step, which I consider more helpful, because you do not need to stop and play the video.

https://nickyreinert.medium.com/wie-funktioniert-der-sha256-...

edwnj 4 years ago | |

not everybody speaks barbarian sir

y42 4 years ago | | |

Python that bad?

p1mrx 4 years ago |

Can it be proven whether values of m exist such that SHA256(m) == 0?

If I were omnipotent and wanted people to believe in me, I would write a book that hashes to 0, so that anyone could verify its authenticity.

userbinator 4 years ago | |

One way to prove it might be to actually find the "null hash"; turn it into a challenge/puzzle, don't mention hashing or crypto or that it's really hard, and let all the bored people of the world play with it. Perhaps someone might notice something that all the highly-trained cryptographers have missed all along: https://news.ycombinator.com/item?id=30245419

orangepenguin 4 years ago | |

Maybe I'm being incredibly naive, but it seems like this would be trivial. Can you just start with the output hash and then essentially run the algorithms backwards? Obviously the resulting "input" would be random-ish garbage, but it seems like if all you care about is the output, you can pretty much just "pick" any data for the last step that produces the output. Then do likewise for the step prior, and so on.

arcastroe 4 years ago | | |

As a comment above stated, part of the "input" is the initialized values:

> Initialize hash values h0 to h7 (first 32 bits of the fractional parts of the square roots of the first 8 primes 2..19).

My guess is h0 to h7 change throughout the algorithm. If you perform each step in "reverse" as you suggest, "picking" any input at each step that produces the required output for that step, then you may not arrive at the correct initial state with the square roots of the first 8 primes.

You'll arrive at "random-ish garbage".
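For reference, the initial values quoted above can be recomputed in a few lines of Python. This is a sketch using exact integer square roots, and the function names are made up for the example. In the Wikipedia pseudocode, h0 to h7 are indeed updated after every chunk, so a backwards walk has to land on exactly these eight words.

```python
from math import isqrt

def first_primes(n: int) -> list:
    """Return the first n primes by trial division (fine for tiny n)."""
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

def initial_hash_values() -> list:
    """First 32 bits of the fractional parts of sqrt(p) for the first 8 primes."""
    # isqrt(p << 64) equals floor(sqrt(p) * 2**32); the low 32 bits are the fraction.
    return [isqrt(p << 64) & 0xFFFFFFFF for p in first_primes(8)]

if __name__ == "__main__":
    for i, value in enumerate(initial_hash_values()):
        print(f"h{i} = 0x{value:08x}")
```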

y42 4 years ago | |

Sure, the difficulty factor would be pretty high, but it's possible.

p1mrx 4 years ago | | |

But is it provably possible for SHA-256?

An n-bit cryptographic hash function should ideally cover every n-bit output value, given slightly more than n bits of input, but I don't know whether this has been proven for any real-world functions.
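This proves nothing about whether the all-zero digest is reachable, but a toy Python sketch shows how the brute-force cost scales. It searches for an input whose digest starts with a given number of zero bits, and the function name `find_zero_prefix` and the choice of decimal counters as inputs are arbitrary. For a well-behaved hash each extra zero bit roughly doubles the expected number of tries, so 16 bits is quick and all 256 bits is out of reach this way.

```python
import hashlib
from itertools import count

def find_zero_prefix(bits: int):
    """Brute-force an input whose SHA-256 digest begins with `bits` zero bits."""
    for n in count():
        candidate = str(n).encode()
        digest = hashlib.sha256(candidate).digest()
        # Keep only the top `bits` bits of the 256-bit digest and compare to zero.
        if int.from_bytes(digest, "big") >> (256 - bits) == 0:
            return candidate, n + 1

candidate, tries = find_zero_prefix(16)

if __name__ == "__main__":
    print(f"input {candidate!r} after {tries} tries")
    print(hashlib.sha256(candidate).hexdigest())
```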

picture 4 years ago | |

You could in theory also get that with a lot of computation of course

ypcx 4 years ago |

Similar project which visualizes SHA-256 into terminal: https://github.com/in3rsha/sha256-animation

spdebbarma 4 years ago |

This comes to my attention at a really convenient time. As a teenager, I initially got interested in Computer Science due to cryptography. Over a decade later, I've gotten into the subject for the first time since then.

For the last few days, I've been writing my own encryption for fun even though it's 100% not secure enough or powerful. My belief is that even though it's not super useful, the experience of attempting to write one is teaching me a lot more than I would have by simply studying it.

zahllos 4 years ago | |

Rather than write crypto, what I'd actually recommend is to break it. Schneier put together a block cipher cryptanalysis course a long time ago and while I don't usually recommend his crypto books these days, the course is good: https://www.schneier.com/academic/archives/2000/01/self-stud... (in this case, his crypto book might actually be useful, because it documents some of these out of date ciphers. There's (was?) a mistake in the DES code though iirc).

It is essentially a tour of all the early cryptanalysis literature, complete with suggestions of ciphers to target (e.g. FEAL). This will give you a sense of how the attacks work. Many of the ciphers are old, but I wouldn't let that put you off.

The study technique for this would be to a) implement each cipher with toggles to control rounds and any other features, then b) implement the attacks. Most of the papers should be open access by now since the 'course' was written in the year 2000. You could also 'catch up' on the constructions and attacks that have come out since.

I would caveat this with: what I am advising is purely for potential interest. Bear in mind there is little need to implement new block ciphers these days (what I'm saying is: this is a very specialized skill and most people won't find jobs in it).

brk 4 years ago |

How long before we see this website as the source for some "hacker sequence" in a movie where a person wearing a black hoodie states they are "... working on cracking their SHA-256 encryption, should only take a sec."

can16358p 4 years ago | |

If such a movie even mentions SHA-256 I think that's above average on its own.

breakingcups 4 years ago | | |

That depends..

"Just a second, I need to backdoor the SHA256 rootkit to penetrate the directory. Shit, they have an X62 firewall. Luckily I brought my pentester."

reincarnate0x14 4 years ago | |

Syntax-highlighted ALGOL-y code blocks seem to be back in style if they're not going with the constant "toss that screen up to a hologram" bit.

Someday there will be a nice interview about the design work that goes into that, not necessarily for "hacker sequences" but for imaginary computer interfaces in general, like https://www.hudsandguis.com/home/2021/theexpanse .

DJPocari 4 years ago |

This is fantastic. I once implemented SHA-256 in Google Sheets to visualize it, but it had horrible performance compared to this. This is the best visualization I've seen yet.

westurner 4 years ago |

SHA2: https://en.wikipedia.org/wiki/SHA-2

https://rosettacode.org/wiki/SHA-256

Hashcat's GPU-optimized OpenCL implementation: https://github.com/hashcat/hashcat/blob/master/OpenCL/inc_ha...

Bitcoin's CPU-optimized sha256.cpp, sha256_avx2.cpp, sha256_sse4.cpp, sha256_sse41.cpp: https://github.com/bitcoin/bitcoin/blob/master/src/crypto/sh...

https://github.com/topics/sha256 https://github.com/topics/sha-256

Cryptographic_hash_function#Cryptographic_hash_algorithms: https://en.wikipedia.org/wiki/Cryptographic_hash_function#Cr...

Merkle–Damgård construction: https://en.m.wikipedia.org/wiki/Merkle%E2%80%93Damg%C3%A5rd_...

(... https://rosettacode.org/wiki/SHA-256_Merkle_tree ... Merkleized IAVL+ tree that is balanced with rotations in order to optimize lookup: https://github.com/cosmos/iavl

Self-balancing binary search tree: https://en.wikipedia.org/wiki/Self-balancing_binary_search_t... )

daenz 4 years ago |

You should be very satisfied with how well you conveyed the algorithm. Well done. What was your approach to arriving at this result?

manceraio 4 years ago | |

Thanks :) I first implemented sha256 in js to understand its inner workings. Then I started displaying its variables with react and adding this stepped mechanism. Finally, I added some notes on the left to add some context of what is going on.

daenz 4 years ago | | |

Very nice. Have you built any other algorithm visualizations? I have a very strong interest in how algorithms are visualized so that they are more easily understood.

chris_l 4 years ago |

Someone should make a collection of the various visualization web projects that crop up here from time to time.

jonathanyc 4 years ago |

I love single-purpose websites like this that put a potentially complex implementation behind an elegantly simple interface. This website’s design and styling are pretty too :) Another useful one is https://www.h-schmidt.net/FloatConverter/IEEE754.html . I’d say even https://godbolt.org/ counts!

bmitc 4 years ago |

Does anyone have any good references, preferably a book but a good detailed website is fine, on cryptography, hashing, public/private keys, tokens, encryption, etc. as it relates to a software engineer? I don't necessarily want to know all the nitty gritty details of how these things are implemented. Rather, I think I would prefer simply understanding them and how to use them, piece them together, etc. to build something out of them.

I just have very little knowledge in this area. I'm going through a how to build a blockchain book right now, and I find myself struggling a little bit where I'm just calling some library functions but not necessarily knowing how to compose things properly.

manceraio 4 years ago | |

For that I like this one: https://cryptobook.nakov.com/

bmitc 4 years ago | | |

Thanks!

lanecwagner 4 years ago | |

I wrote this one: https://app.qvault.io/course/6321ddbf-49eb-4748-9737-6bc12e8...

it's a crypto course where you write the solutions in Go. you might enjoy it :)

tptacek 4 years ago | |

JP Aumasson is one of the authors of the BLAKE hashes and wrote "Serious Cryptography":

https://www.amazon.com/Serious-Cryptography-Practical-Introd...

bmitc 4 years ago | | |

Thanks!

anonymousDan 4 years ago |

I have an odd request regarding e.g. SHA-3. Can anyone tell me if it is implemented in a way that is in a sense 'one-pass' over its input, i.e. each byte of its input in memory is accessed only once, after which all of the algorithm state is held in registers and the original input is never accessed again? My scenario is one where I'm concerned about TOCTOU-like attacks on the memory where the input is stored, but I don't want to pay the overhead of first copying the whole input to a 'safe' memory location, e.g. imagine I have kernel code wanting to compute a hash over data stored in userspace.

oib 4 years ago | |

Yes, sha3 reads every input byte only once. It does hold a pretty large internal state that doesn't fit in only registers.
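As a rough illustration of the one-pass shape of the interface, here is a Python sketch that feeds SHA3-256 chunk by chunk via hashlib. The function name `sha3_256_stream` and the chunk size are arbitrary. It only shows that the hash can be computed incrementally without ever revisiting earlier input. It says nothing about how a given kernel implementation touches userspace memory.

```python
import hashlib
import io

def sha3_256_stream(stream, chunk_size: int = 4096) -> str:
    """Hash a stream incrementally; each chunk is handed to the hasher exactly once."""
    hasher = hashlib.sha3_256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)  # absorbed into the internal state, never re-read
    return hasher.hexdigest()

if __name__ == "__main__":
    data = b"x" * 10000
    print(sha3_256_stream(io.BytesIO(data)) == hashlib.sha3_256(data).hexdigest())
```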

a-dub 4 years ago |

this is funny. when i first learned the algorithm, i found some matlab code that computes it with bit vectors. i added support for displaying them as an image and used the movie feature to generate step by step movies to build intuition.

nice to see someone build something polished that visualizes it in the same way. once you look at the mechanics for each round of the compression function and see the bits get swirled around for yourself, it starts to make intuitive sense.

the other big intuitions are, of course, the one-way feel of add mod 2^32 (which is implicit in unsigned integer overflow on many machines) and the fact that some operations (like xor) are linear over galois field 2, while others (like addition) work in the integers mod 2^32 (a ring rather than a field), and the repeated stacking of operations from these different structures gives the function its nonlinear, hard-to-invert property.
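a tiny python sketch of that mismatch, assuming 32-bit words (helper names are made up): xor plays nicely with rotation, so the two can be swapped freely, while addition mod 2^32 does not because of the carries.

```python
MASK = 0xFFFFFFFF

def rotr(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK

def add32(a: int, b: int) -> int:
    """Addition mod 2^32, i.e. unsigned 32-bit overflow."""
    return (a + b) & MASK

a, b = 0x12345678, 0x9ABCDEF0

# xor is bitwise, so rotating before or after makes no difference.
xor_commutes = rotr(a ^ b, 7) == rotr(a, 7) ^ rotr(b, 7)

# addition carries between bit positions, so the same swap fails: 1 + 1 = 2.
add_commutes = rotr(add32(1, 1), 1) == add32(rotr(1, 1), rotr(1, 1))

if __name__ == "__main__":
    print("rotate distributes over xor:", xor_commutes)
    print("rotate distributes over add:", add_commutes)
```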

i remember reading a pretty good paper on the arx (add, rotate, xor) family of ciphers back in the day (sort of in the vein of, is that all you need?)...

Darkphibre 4 years ago |

Man, this is amazing. I had to hand-unroll bit packing in a binary encoding scheme we used in a game. Rare enough that making a tool wasn't worth it, but damn I love your visualizations! Doing something like that would have helped others understand how I was "seeing the matrix."

recursive 4 years ago |

On the third step(?) of the second step, it says "Copy 2nd chunk into 1st 16 words", but it's accompanied by a visualization of copying the 1st chunk into the 1st 16 words. Am I just totally misunderstanding something?

seumars 4 years ago |

Fantastic presentation! The utility functions from the source code are just as useful.

ansible 4 years ago |

Is there a library or application that can take an annotated algorithm, and then generate a website like this one? That would be great for beginning CS and the sorting algorithms and other basic data structures too.

abrookewood 4 years ago |

Looks fantastic, but the only thing missing is why each step is done.

fthtls 4 years ago |

great visualization. i've also checked the source code and utility functions. they are very well defined and useful too.

i've coded a sha256 decrypter recently which uses dictionary attack and brute force. I read lots of articles about sha256 while coding this tool. there were still some missing parts on my mind, but your project clarified all.

btw, the decrypter i coded -> https://10015.io/tools/sha256-encrypt-decrypt
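for readers wondering what a dictionary attack on sha256 looks like in practice, here is a minimal python sketch. it is not the code behind the linked tool, and the word list and function name are invented for the example. it does not reverse the hash, it just hashes guesses until one matches.

```python
import hashlib

def dictionary_attack(target_hex: str, candidates):
    """Return the first candidate whose SHA-256 hex digest equals target_hex, else None."""
    for word in candidates:
        if hashlib.sha256(word.encode()).hexdigest() == target_hex:
            return word
    return None

# Hypothetical word list; real attacks use lists with millions of entries.
wordlist = ["123456", "password", "letmein", "qwerty"]
target = hashlib.sha256(b"password").hexdigest()

if __name__ == "__main__":
    print(dictionary_attack(target, wordlist))
```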

based2 4 years ago |

https://en.wikipedia.org/wiki/Secure_Hash_Algorithms

anandsuresh 4 years ago |

Pretty cool. Have been looking for something like this for a while. Thanks for building it.

Just sent you a PR for some typos I found while running through an example.

M4tze 4 years ago |

Great visualization. Might become my new screensaver.

colejohnson66 4 years ago | |

One could even feed the hash back into itself to get a different result each time

em3rgent0rdr 4 years ago | | |

how long would the cycle last before it starts repeating?
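One way to get a feel for it is to truncate the hash so the cycle is small enough to find. The Python sketch below keeps only the first 3 bytes of each digest (an arbitrary choice for the demo) and walks until a value repeats. For a random function on N states the expected walk before a repeat is on the order of sqrt(N), so if full SHA-256 behaves like one, that would be roughly 2^128 steps.

```python
import hashlib

def step(value: bytes, width: int = 3) -> bytes:
    """Feed the value back into SHA-256 and keep only the first `width` bytes."""
    return hashlib.sha256(value).digest()[:width]

def find_cycle(start: bytes, width: int = 3):
    """Walk the iteration until a value repeats; return (tail length, cycle length, entry)."""
    seen = {}
    value, index = start, 0
    while value not in seen:
        seen[value] = index
        value = step(value, width)
        index += 1
    tail = seen[value]          # steps taken before entering the cycle
    return tail, index - tail, value

tail_len, cycle_len, entry = find_cycle(b"\x00\x00\x00")

if __name__ == "__main__":
    print(f"tail of {tail_len} steps, then a cycle of {cycle_len} steps")
```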

stevofolife 4 years ago |

Well made! Can you share how you made the website?

sylware 4 years ago |

Can we have a video of this on youtube/dailymotion/vimeo/etc which we can download with yt-dlp?

haunter 4 years ago | |

You can screen record the full screen browser window with OBS (with the mouse disabled too)

reincarnate0x14 4 years ago |

That's super cool. Visualized or illustrated algorithms have always looked so magical, to me at least.

nwatab 4 years ago |

Looks beautiful. I can see it is really complicated (even though I know how SHA-256 is calculated).

dicroce 4 years ago |

Please do this for b+trees. :)

hombre_fatal 4 years ago |

Beautiful color palette.

berta 4 years ago |

This is awesome!

iqanq 4 years ago |

Two of the buttons at the top have no "title" attribute and therefore no tooltip.

jerpint 4 years ago |

Very satisfying!

// X A B X^1 X^-1 :: Difference
471490377 6 13 = 1365552781 = 471490377 :: 0
1528396978 9 11 = -1576695076 = 1528396978 :: -0
1592322722 9 20 = 622346385 = 1592322722 :: -0
1214152986 8 16 = -1748578289 = 1214152985 :: -1
1193897367 2 16 = 907713766 = 1193897366 :: -1
335642564 9 10 = 318891964 = 335642564 :: -0
486208953 16 23 = 894211128 = 486208952 :: -1
629577059 13 14 = 1383225523 = 629577058 :: -1
1609442937 8 18 = 674046110 = 1609442937 :: -0
234450967 6 12 = -459008694 = 234450966 :: -1
1840721644 19 28 = -602984005 = 1840721644 :: -0