As much Stack Overflow as possible in 4096 bytes (danlec.com)

382 points by df07 12 years ago | 72 comments

mberning 12 years ago |

Very impressive. I wish extreme performance goals and requirements would become a new trend. I think we have come to accept a certain level of sluggishness in web apps. I hate it.

I wrote a tire search app a few years back and made it work extremely fast given the task at hand. But I did not go to the level that this guy did. http://tiredb.com

bane 12 years ago | |

Now that we have blisteringly fast computers, it's worth it to browse old websites and see what "snappy" looks like.

http://info.cern.ch/hypertext/WWW/TheProject.html

If we could cram more modern functionality into say...twice or three times the performance of the above, I think the web would be a better place. Instead the web is a couple orders of magnitude slower.

deckiedan 12 years ago | | |

Yes. In some ways I think we're still at a very primitive level for web development. Either you do it by hand, tweaking each individual parameter like the old demoscene, and making it fast and amazingly small, or else you write huge chunky slow web apps, or more usually, something in the middle.

I feel like the big thing I'm missing is smart compilers that can take web app concepts, and turn them into extremely optimised 'raw' HTML/CSS/JS/SQL/backend. All of the current frameworks still use hand-written, frequently very bloated or inelegant CSS & HTML, and still require thinking manually about how and when to do AJAX when it's least offensive to the user. Maybe something like yesod ( http://www.yesodweb.com/ ) or something like that is heading in the right direction. http://pyjs.org/ has some nice ideas too... But I'm thinking of something bigger than the individual technologies like coffeescript or LESS... Something that doesn't 'compile to JS', or 'compile to CSS', but 'compile to stack'. I dunno. Maybe I'm just rambling.

TillE 12 years ago | | |

There are examples of advanced functionality performing well enough. Google Docs is quite fast, especially for what it is.

On the other hand, there are sites which are conceptually much simpler but incredibly sluggish. Twitter is a particularly bad offender after you've scrolled down a few pages. Or any other site that uses a ton of Ajax with little regard for the consequences.

dredmorbius 12 years ago | | |

I absolutely love TBL's initial documents. They're utterly semantic markup. Which means that you can apply a minimal amount of CSS to have them appear in a pleasant-to-read format. Let's see if I can find that pastebin .... Here: http://pastebin.com/7sGiHBwF

But, yeah. If webpages would just revert to what TBL had created (yes, I'll allow for images and minimal other frippery) things would be so much more manageable.

userbinator 12 years ago | |

> I wish extreme performance goals and requirements would become a new trend.

Not just performance, but efficiency - both speed and size. Sadly it seems that most of the time this point is brought up, it gets dismissed as "premature optimisation". Instead we're taught in CS to pile abstraction upon abstraction even when they're not really needed, to create overly complex systems just to perform simple tasks, to not care much about efficiency "because hardware is always getting better". I've never agreed with that sort of thinking.

I think it creates a skewed perception of what can be accomplished with current hardware, since it makes optimisation an "only if it's not fast/small enough/we can't afford new hardware" goal; it won't be part of the mindset when designing, nor when writing the bulk of the code. The demoscene challenges this type of thought; it shows that if you design with specific size/speed goals in mind, you can achieve what others would have thought to be impossible. I think that's a real eye-opener; by pushing the limits, it's basically saying just how extremely inefficient most software is.

bane 12 years ago | | |

> Instead we're taught in CS to pile abstraction upon abstraction even when they're not really needed, to create overly complex systems just to perform simple tasks, to not care much about efficiency "because hardware is always getting better". I've never agreed with that sort of thinking.

Right, exactly. It's obvious too that software has scaled faster than hardware in the sense that to do an equivalent task like say, boot to a usable state, takes orders of magnitude longer today than it used to, despite having hardware that's also orders of magnitude faster.

So when I see a demo of ported software that does something computing used to do back in the 90s (but slowly), I'm really only impressed by the massive towers of abstraction we're building on these days, but what we're actually able to do is not all that much better. To think that I'm sitting on a machine capable of billions of instructions per second, and I'm watching it perform like a computer doing millions, is frankly depressing.

All of this is really to make the programmers more efficient, because programmer time is expensive (and getting stuff out the door quicker is important), but the amount of lost time (and money) on the user's end, waiting for these monstrosities of abstraction to compute something must far far exceed those costs.

I'm actually of the opinion that developers should work on or target much lower end machines to force them to think of speed and memory optimizations. The users will thank them and the products will simply "be better" and continue to get better as machines get better automatically.

columbo 12 years ago | |

> I wish extreme performance goals and requirements would become a new trend.

Well, there will always be demoscene (http://www.youtube.com/watch?v=5lbAMLrl3xI ) which I've always found remarkable.

userbinator 12 years ago | | |

The Windows demos almost always use a lot of the system libraries for the bulk of their work, which hasn't impressed me quite as much as what you can do in 4k with bare DOS --- where the code is directly manipulating the hardware. No libraries, no GPU drivers:

http://www.youtube.com/watch?v=dGQEeArYDS8

dangoldin 12 years ago | |

I agree. I've been using Ghostery recently to see the external libraries that are loaded on various sites and it's ridiculous. Some sites are loading more than 50 external scripts.

scott_karana 12 years ago | |

You're not kidding! The site is amazingly fast.

jc4p 12 years ago |

Some of the workarounds he mentions at the end of his Trello in 4096 bytes[1] post seem really interesting:

- I optimized for compression by doing things the same way everywhere; e.g. I always put the class attribute first in my tags

- I wrote a utility that tried rearranging my CSS, in an attempt to find the ordering that was the most compressible

[1] http://danlec.com/blog/trello-in-4096-bytes
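The second trick, searching for the most compressible ordering, can be sketched in a few lines. This is a minimal illustration and not the author's utility: the rules in `RULES` are hypothetical, the search is a brute-force pass over every permutation (only feasible for a handful of rules), and it measures a zlib stream rather than the exact gzip output of the page. Reordering real CSS can also change which rule wins the cascade, so a real tool would have to respect that.

```python
import itertools
import zlib

# Hypothetical CSS rules; any small stylesheet split into rules would do.
RULES = [
    ".q{margin:0;padding:4px}",
    ".a{margin:0;padding:8px}",
    ".t{color:#333;font-weight:bold}",
    ".u{color:#333;font-size:12px}",
]


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib (DEFLATE) stream at the highest level."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def best_ordering(rules):
    """Try every permutation; return (size, ordering) for the smallest output."""
    return min(
        ((compressed_size("".join(p)), p) for p in itertools.permutations(rules)),
        key=lambda pair: pair[0],
    )


baseline = compressed_size("".join(RULES))
best_size, best_rules = best_ordering(RULES)
print("original order:", baseline, "bytes; best order found:", best_size, "bytes")
```

With more than nine or ten rules the factorial blow-up makes this impractical, so a random or greedy search would be the obvious substitute.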

Whitespace 12 years ago |

I'm curious if a lot of the customizations re: compression could be similarly achieved if the author used Google's mod_pagespeed for Apache[0] or nginx[1], as it does a lot of these things automatically, including eliding CSS/HTML attributes and generally rearranging things for optimal sizes.

It could make writing for 4k less of a chore?

In any case, this is an outstanding hack. The company I work for has TLS certificates that are larger than the payload of his page. Absolutely terrific job, Daniel.

[0]: https://code.google.com/p/modpagespeed/

[1]: https://github.com/pagespeed/ngx_pagespeed

edit: formatting

lstamour 12 years ago | |

Well, the TLS problem is why we'd also want QUIC. But that's another story...

nej 12 years ago |

Wow navigating around feels instant and it almost feels as if I'm hosting the site locally. Great job!

derefr 12 years ago |

> I threw DRY out the window, and instead went with RYRYRY. Turns out just saying the same things over and over compresses better than making reusable functions

This probably says something about compression technology vs. the state of the art in machine learning, but I'm not sure what.
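One way to poke at the quoted claim is to compare raw and compressed sizes for the same logic written both ways. The snippets below are made up for illustration and are not taken from St4k, and zlib at level 9 stands in for whatever compressor the author used. On inputs this small the factored version may still come out ahead; the thing to look at is how much of the raw-size gap survives compression.

```python
import zlib

# Hypothetical snippets: the same markup written out three times, versus a
# helper function that is called three times.
repeated = (
    'e.innerHTML+="<div class=q><a href=/q/"+a.id+">"+a.title+"</a></div>";'
    'e.innerHTML+="<div class=q><a href=/q/"+b.id+">"+b.title+"</a></div>";'
    'e.innerHTML+="<div class=q><a href=/q/"+c.id+">"+c.title+"</a></div>";'
)
factored = (
    'function r(x){e.innerHTML+="<div class=q><a href=/q/"+x.id+">"+x.title+"</a></div>"}'
    "r(a);r(b);r(c);"
)


def sizes(source: str):
    """Return (raw bytes, zlib level-9 compressed bytes) for a source string."""
    raw = source.encode("utf-8")
    return len(raw), len(zlib.compress(raw, 9))


for name, source in (("repeated", repeated), ("factored", factored)):
    raw_size, packed_size = sizes(source)
    print(f"{name}: {raw_size} bytes raw, {packed_size} bytes compressed")
```

DEFLATE replaces a repeated run with a short back-reference, so the second and third copies cost far less than their raw length suggests, while the function wrapper and its call sites are all new bytes.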

cobookman 12 years ago |

First off, nice work. I've noticed that St4k is loading each thread using ajax, whereas stackoverflow actually opens a new 'page', reloading a lot of web requests. Disclaimer: I've got browser cache disabled.

E.g., on a thread click:

St4k:

GET https://api.stackexchange.com/2.2/questions/21840919 [HTTP/1.1 200 OK 212ms] 18:02:16.802

GET https://www.gravatar.com/avatar/dca03295d2e81708823c5bd62e75... [HTTP/1.1 200 OK 146ms] 18:02:16.803

stackoverflow.com (a lot of web requests):

GET http://stackoverflow.com/questions/21841027/override-volume-... [HTTP/1.1 200 OK 120ms] 18:02:54.791

GET http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min... [HTTP/1.1 200 OK 62ms] 18:02:54.792

GET http://cdn.sstatic.net/Js/stub.en.js [HTTP/1.1 200 OK 58ms] 18:02:54.792

GET http://cdn.sstatic.net/stackoverflow/all.css [HTTP/1.1 200 OK 73ms] 18:02:54.792

GET https://www.gravatar.com/avatar/2a4cbc9da2ce334d7a5c8f483c92... [HTTP/1.1 200 OK 90ms] 18:02:55.683

GET http://i.stack.imgur.com/tKsDb.png [HTTP/1.1 200 OK 20ms] 18:02:55.683

GET http://static.adzerk.net/ados.js [HTTP/1.1 200 OK 33ms] 18:02:55.684

GET http://www.google-analytics.com/analytics.js [HTTP/1.1 200 OK 18ms] 18:02:55.684

GET http://edge.quantserve.com/quant.js

....and more....

SmileyKeith 12 years ago |

This is amazing. As others have said I really wish this kind of insane performance would be a goal for sites like this. After trying this demo I found it difficult to go back to the same pages on the normal site. Also I imagine even with server costs this would save them a lot of bandwidth.

masswerk 12 years ago |

And now consider that 4096 bytes (words) was exactly the total memory of a DEC PDP-1, considered to be a mainframe in its time and featuring timesharing and things like Spacewar!.

And now we're proud to have a simple functional list compiled into the same amount of memory ...

rangibaby 12 years ago | |

Your iPhone also had more computing power than the rest of the world. Combined! :-)

masswerk 12 years ago | | |

And it even can play Bach and connect to the network, like the PDP-1! :-)

afhof 12 years ago |

4096 is a good goal, but there is a much more obvious benefit at 1024 since it would fit within the IPv6 1280 MTU (i.e. a single packet). I recall hearing stories that the Google Homepage had to fit within 512 bytes for IPv4's 576 MTU.

tedd4u 12 years ago | |

One packet is great if you can do it. There's a big penalty after the sender in a new TCP connection reaches the initial transmit window. A lot of sites these days have configured this up from 2x or 3x MSS to 10x MSS (about 5,360 bytes) to increase what can be sent in the first transmission back from the server (HTTP response for example).

Dylan16807 12 years ago | | |

If they're configured for 10x they're probably also going to be using an MSS of 1460, so you can cram 14 kilobytes of data into the initial response.
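The arithmetic in this subthread is easy to check. The sketch below is a rough model, assuming an idealized slow start in which the window doubles every round trip with no loss, and ignoring HTTP headers, TLS records, and the handshake itself; the function names are made up for illustration.

```python
def initial_window_bytes(segments: int, mss: int) -> int:
    """Payload bytes a sender can transmit before waiting for the first ACK."""
    return segments * mss


def round_trips_to_send(payload: int, segments: int, mss: int) -> int:
    """Round trips needed under idealized slow start (window doubles, no loss)."""
    window = initial_window_bytes(segments, mss)
    sent = 0
    trips = 0
    while sent < payload:
        sent += window
        window *= 2
        trips += 1
    return trips


print(initial_window_bytes(10, 536))    # the "10x MSS" figure quoted above
print(initial_window_bytes(10, 1460))   # the same window with an MSS of 1460
for segments in (2, 3, 10):
    print(segments, "segments:", round_trips_to_send(4096, segments, 536), "round trips for 4096 bytes")
```

Under these assumptions a 4096-byte page fits in the first flight with a 10-segment window even at an MSS of 536, but not with the older 2- or 3-segment windows.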

jonalmeida 12 years ago |

Pages load almost instantly, as if it's a local webserver - I'm quite impressed.

blazespin 12 years ago |

Very impressive! So incredibly fast.

My only thoughts are that search is the real bottleneck.

Jakob 12 years ago |

I didn’t realize that the original site is already quite optimized. With a primed cache the original homepage results in only one request:

    html ~200KB (~33KB gzipped)

Not bad at all. Of course the 4k example is even more stunning. Could the gzip compression best practices perhaps be added to an extension like mod_pagespeed?

kislayverma 12 years ago |

Very very awesome.

I'd take some trade-off between crazy optimization and maintainability, but I'd definitely rather do this than slap on any number of frameworks because they are the new 'standard'.

Of course, the guy who has to maintain my code usually ends up crying like a little girl.

dclowd9901 12 years ago |

>"I threw DRY out the window, and instead went with RYRYRY. Turns out just saying the same things over and over compresses better than making reusable functions"

I would love to investigate this further. I've always had a suspicion that the aim to make everything reusable for the sake of byte size actually has the opposite effect, as you have to start writing in support and handling tons of edge cases as well, not to mention you now have to write unit tests so anyone who consumes your work isn't burned by a refactor. Obviously, there's a place for things like underscore, jquery, and boilerplate code like Backbone, but bringing enterprise-level extensibility to client code is probably mostly a bad thing.

nathancahill 12 years ago |

This is really fast! Love it. I thought the real site was fast until I clicked around on this.

arocks 12 years ago |

Looks broken on my Android mobile, but seriously this is incredible!

Wonder how we can unobfuscate the source. It would be great if there is a readable version of the source as well, just like we have in Obfuscated C Code Contests. Or perhaps, some way to use the Chrome inspector for this.

rangibaby 12 years ago | |

Using HTML prettify on the source is a start at least:

https://github.com/victorporof/Sublime-HTMLPrettify

TacticalCoder 12 years ago |

In a different style, the "Elevated" demo, coded in 4K (you'll have a hard time believing it if you haven't seen it yet):

http://www.youtube.com/watch?v=_YWMGuh15nE

shdon 12 years ago |

His root element is "<html dl>". I'm not aware of the dl attribute even existing... Is that for compressibility or does the "dl" actually do something?

jazzdev 12 years ago |

Impressive, and a useful exercise, but it doesn't seem practical to give up DRY in favor of RYRYRY just because it compresses better and saves a few bytes.

iamdanfox 12 years ago |

The simpler UI is quite pleasant to use isn't it! I wonder if companies would benefit from holding internal '4096-challenges'?

nandhp 12 years ago |

Code is formatted in a serif font, instead of monospace, which seems like a rather important difference. Otherwise, it is quite impressive.

dubcanada 12 years ago | |

You most likely don't have Consolas, or Monaco then.

That font family should have been

Consolas,Monaco,monospace

Rather than

Consolas,Monaco,serif

But whatever :)

timtadh 12 years ago | | |

Yep, but fixing it breaks the 4096 barrier:

    $ curl -s http://danlec.com/st4k | gzip -cd | sed 's/serif/monospace/' | gzip -9c | wc
        14      94    4098

netghost 12 years ago | |

If you're using Chrome, there's a bug in recent versions that seems to butcher font rendering at random.

Try popping open the inspector panel, and the fonts will magically correct themselves.

shawabawa3 12 years ago | |

It's monospace for me (Chrome on Windows 7)

timtadh 12 years ago |

funny, his compressor must do a better job than mine:

    $ curl -s http://danlec.com/st4k | wc
         14      80    4096
    $ curl -s http://danlec.com/st4k | gzip -cd | wc
         17     311   11547
    $ curl -s http://danlec.com/st4k | gzip -cd | gzip -c | wc
         19     103    4098

bdonlan 12 years ago | |

Turn up the compression level:

    $ curl -s http://danlec.com/st4k | gzip -cd | gzip -9c | wc
         14      80    4096
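The two-byte difference comes from the compression level: the gzip command line defaults to level 6, and `-9` asks for the slowest, most thorough search. The same comparison can be reproduced from Python, where `gzip.compress` defaults to level 9. The sample below is a hypothetical stand-in for the page source, so the numbers will not match the ones in this thread.

```python
import gzip


def gzip_sizes(data: bytes, levels=(1, 6, 9)):
    """Map each compression level to the size of the resulting gzip stream."""
    return {level: len(gzip.compress(data, compresslevel=level)) for level in levels}


# Hypothetical stand-in for the page source; swap in the real bytes to compare.
sample = b"".join(
    b"<div class=q><a href=/q/%d>Question %d</a></div>" % (i, i) for i in range(300)
)

sizes = gzip_sizes(sample)
for level, size in sizes.items():
    print(f"level {level}: {size} bytes (from {len(sample)} raw)")
```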

timtadh 12 years ago | | |

Right. I feel silly for not trying that. Good spot.

scoopr 12 years ago |

There seem to be many bytes left! :)

    $ zopfli -c st4k |wc
          11     127    4050

slackito 12 years ago | |

Thanks for the pointer to zopfli. I've used p7zip in the past as a "better gzip", and it gets good results for this one too :D

    $ curl -s http://danlec.com/st4k | gzip -cd | 7z a -si -tgzip -mx=9 compressed.gz
    $ wc compressed.gz
      14   84 4048 compressed.gz

dangayle 12 years ago |

I'd love to see a general list of techniques you use, as best practices.

thedufer 12 years ago | |

There's a short list at the end of the post about Trello4k: http://danlec.com/blog/trello-in-4096-bytes

dangayle 12 years ago | | |

Thanks. How much of that could we do during the original design phase?

tantalor 12 years ago |

> The stackoverflow logo is embedded?

Did you try a png data url? Could be smaller.
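Whether a PNG data URL would be smaller depends on two costs that are easy to estimate. Base64 turns every 3 input bytes into 4 output characters, and PNG data is already compressed, so gzip is unlikely to win much of that back. A minimal sketch, assuming a made-up payload (the real logo bytes are not reproduced here):

```python
import base64
import math


def data_url(payload: bytes, mime: str = "image/png") -> str:
    """Build a base64 data URL for the given bytes."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def base64_length(n: int) -> int:
    """Number of base64 characters (with padding) needed for n input bytes."""
    return 4 * math.ceil(n / 3)


# Hypothetical 100-byte payload standing in for an image file.
payload = bytes(range(100))
url = data_url(payload)
print(len(payload), "bytes in,", len(url), "characters out")
```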

jpatel3 12 years ago |

Way to go!

stefan_kendall 12 years ago |

Maybe part of the story here is that gzip isn't the be-all-end-all of compression. A lot of the changes were made to appease the compression algorithm; seems like the algorithm could change to handle the input.

A specialized compression protocol for the web?
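One existing mechanism in this direction is DEFLATE's preset dictionary: both sides agree ahead of time on a block of likely content, and the compressor can refer back to it as if it had already been sent. This is a different technique from changing the algorithm itself, and browsers do not negotiate it for ordinary gzip responses. The dictionary and page below are hypothetical, chosen only to show the effect with Python's `zlib`.

```python
import zlib

# Hypothetical dictionary of strings that are common in HTML pages.
DICTIONARY = (
    b'<!DOCTYPE html><html><head><meta charset="utf-8"><title></title>'
    b'<style></style></head><body><div class=""></div><a href=""></a>'
    b"<script></script></body></html>"
)


def deflate(data: bytes, zdict: bytes = b"") -> bytes:
    """Compress at level 9, optionally priming the window with a dictionary."""
    if zdict:
        compressor = zlib.compressobj(level=9, zdict=zdict)
    else:
        compressor = zlib.compressobj(level=9)
    return compressor.compress(data) + compressor.flush()


def inflate(blob: bytes, zdict: bytes = b"") -> bytes:
    """Decompress; the same dictionary must be supplied if one was used."""
    decompressor = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decompressor.decompress(blob) + decompressor.flush()


page = b'<!DOCTYPE html><html><head><title>Hi</title></head><body><a href="/x">x</a></body></html>'
print("plain:", len(deflate(page)), "bytes; with dictionary:", len(deflate(page, DICTIONARY)), "bytes")
```

The catch is that the dictionary has to reach the client somehow, so it only pays off when it is shared across many responses.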