Node.js - A Giant Step Backwards

Node.js - A Giant Step Backwards (fenn.posterous.com)

102 points by Fenn 14 years ago | 114 comments

rbranson 14 years ago |

This purist evented I/O fundamentalism has to stop.

While evented I/O is great for a certain class of problems: building network servers that move bits around in memory and across network pipes at both ends of a logic sandwich, it is a totally asinine way to write most logic. I'd rather deal with threading's POTENTIAL shared mutable state bullshit than have to write every single piece of code that interacts with anything outside of my process in async form.

In node, you're only really saved from this if you don't have to talk to any other processes and you can keep all of your state in memory and never have to write it to disk.

Further, threads are still needed to scale out across cores. What the hell do these people plan on doing when CPUs are 32 or 64 core? Don't say fork(), because until there are cross-process heaps for V8 (aka never), that only works for problems that fit well into the message-passing model.

benatkin 14 years ago | |

redis is pretty badass for talking to other processes without really talking to other processes, and it's the #10 most depended-upon library on npm right now. http://search.npmjs.org/

It won't work for every problem, of course.

dnode is a good way to easily talk to other node.js processes without the HTTP overhead. It can talk over HTTP too, with socket.io.

node-http-proxy is useful as a load balancer, and a load balancer can distribute work between cores.

Finally, most of the node.js people I've met, online and offline, are polyglots, and are happy to pick a good tool for a job. But right now node.js has great libraries for realtime apps, the ability to share code on the client and server in a simple way, and good UI DSLs like jade, less, and stylus.

rbranson 14 years ago | | |

Huh? how does any of this keep me from having to write callback spaghetti? If I send a call using redis or dnode or whatever, I have to wait for it, so that means a callback.

I feel you about the polyglot and tend to agree, but I think some people are really trying to force awkward things into node, like people attempting to write big full-stack webapps using it.

thristian 14 years ago |

It seems a bit cruel that he mentions "horror stories" about Twisted; most of the culture shock people complain about with Twisted is exactly the kind of flow-control shenanigans that he describes in Node.js. In fact, Twisted makes those particular examples easier.

To handle branching flow-control like 'if' statements, Twisted gives you the Deferred object[1], which is basically a data structure that represents what your call stack would look like in a synchronous environment. For example, his example would look something like this, with a hypothetical JS port:

    d = asynchronousCache.get("id:3244"); // returns a Deferred
    d.addCallback(function (result) {
        if (result == null) {
            return asynchronousDB.query("SELECT * from something WHERE id = 3244");
        } else {
            return result;
        }
    });
    d.addCallback(function (myThing) {
        // Do various stuff with myThing here
    });

Not quite as elegant as the original synchronous version, but much tidier than banging raw callbacks together - and more composable. Deferred also has a .addErrback() method that corresponds to try/catch in synchronous code, so asynchronous error-handling is just as easy.
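For readers more at home with today's JavaScript, the same shape can be sketched with native Promises, which postdate this thread. `makeStore` and `lookup` are hypothetical names, plain in-memory objects stand in for the cache and the database, and `.catch()` plays roughly the role that `.addErrback()` plays in Twisted. This is a sketch under those assumptions, not Twisted's API:

```javascript
// Hypothetical in-memory stand-in for an async cache or database.
function makeStore(data) {
  return {
    get: function (key) {
      // Resolve with the stored value, or null on a miss.
      return Promise.resolve(key in data ? data[key] : null);
    }
  };
}

// Try the cache first, fall back to the database, handle errors once at the end.
function lookup(cache, db, key) {
  return cache.get(key)
    .then(function (hit) {
      // Returning a promise here chains it, much like returning a Deferred.
      return hit !== null ? hit : db.get(key);
    })
    .then(function (thing) {
      if (thing === null) throw new Error("not found: " + key);
      return thing;
    })
    .catch(function (err) {
      // Rough analogue of addErrback: one handler for any failure above.
      return { error: err.message };
    });
}
```

A call such as `lookup(cache, db, "id:3244")` returns a promise for either the value or an `{ error: ... }` object, so the caller has a single code path.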

For the second issue raised, about asynchronous behaviour in loops, Twisted supplies the DeferredList - if you give it a list (an Array, in JS) of Deferreds, it will call your callback function when all of them have either produced a result or raised an exception - and give you the results in the same order as the original list you passed in.
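The closest built-in analogue in current JavaScript is probably `Promise.allSettled`, which also waits for every operation to succeed or fail and reports outcomes in input order. In this sketch `fetchUser` is a hypothetical async operation that fails for one id on purpose; it illustrates the idea and is not a port of DeferredList:

```javascript
// Hypothetical async operation; id 2 fails on purpose.
function fetchUser(id) {
  return id === 2
    ? Promise.reject(new Error("user " + id + " not found"))
    : Promise.resolve({ id: id });
}

// Resolves once every lookup has either produced a result or failed.
// Outcomes are reported in the same order as `ids`.
function fetchAll(ids) {
  return Promise.allSettled(ids.map(fetchUser));
}
```

Each entry in the resolved array is either `{ status: "fulfilled", value }` or `{ status: "rejected", reason }`, so one failure does not hide the other results.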

It is a source of endless frustration to me that despite Twisted having an excellent abstraction for dealing with asynchronous control-flow (one that would be even better with JavaScript's ability to support multi-statement lambda functions), JavaScript frameworks generally continue to struggle along with raw callbacks. Even the frameworks that do support some kind of Deferred or Promise object generally miss some of the finer details. For example, jQuery's Deferred is inferior to Twisted's Deferred: http://article.gmane.org/gmane.comp.python.twisted/22891

[1]: http://twistedmatrix.com/documents/current/core/howto/defer....

benatkin 14 years ago | |

Here's a library in JavaScript that does this. https://github.com/kriszyp/promised-io

The differences between your example and the common JavaScript practice for promises (when they're used; most of the time they aren't) are that then is used instead of addCallback and that chaining is available and taken advantage of.

thristian 14 years ago | | |

What do you mean by 'chaining'? Twisted's Deferreds provide at least a couple of things one might casually describe as 'chaining'.

pkulak 14 years ago | |

This is a great library for doing those sorts of things with Node: https://github.com/caolan/async

fedd 14 years ago | |

Can a Deferred fire so fast that it doesn't call some of the callback functions because they were not added yet?

sausagefeet 14 years ago | | |

No.
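The reason is that a Deferred remembers its result, so a callback added late still runs. A toy version makes this visible. It is a hypothetical JavaScript sketch with none of Twisted's errbacks, nested Deferreds, or already-called checks:

```javascript
// Toy Deferred: keeps its result so late callbacks are never missed.
class Deferred {
  constructor() {
    this.fired = false;
    this.result = undefined;
    this.callbacks = [];
  }

  // Queue a callback; if the result already arrived, run it right away.
  addCallback(fn) {
    this.callbacks.push(fn);
    if (this.fired) this._run();
    return this;
  }

  // Deliver the result and run whatever callbacks are queued so far.
  callback(value) {
    this.result = value;
    this.fired = true;
    this._run();
  }

  // Each callback's return value becomes the input of the next one.
  _run() {
    while (this.callbacks.length > 0) {
      this.result = this.callbacks.shift()(this.result);
    }
  }
}
```

Calling `callback(value)` before any `addCallback` is therefore harmless: the value waits until someone asks for it.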

stock_toaster 14 years ago | |

inlineCallbacks also makes things easier, when applicable.

daleharvey 14 years ago |

This is a pretty common pattern for any work you have to do asynchronously. Pretty much all libraries should be implementing this for you, so the first 3 lines should be all you code:

   getSomething("id", function(err, thething) {
       // one true code path
   });

   function getSomething(id, callback) {
     var myThing = synchronousCache.get("id:3244");
     if(myThing) { 
         callback(null, myThing);
     } else { 
         async(id, callback);
     }
   }

a minor quibble with language style isn't exactly what I would call "A Giant Step Backwards"

gizzlon 14 years ago | |

var myThing = synchronousCache.get("id:3244");

I was under the impression that you could not do _anything_ synchronous? What if the call blocks for 100ms? or 1000ms? Won't that delay all other clients and all other requests?
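That concern is easy to check. In the sketch below, `blockFor` is a hypothetical stand-in for a slow synchronous call, and a zero-delay timer stands in for another client's request. The timer cannot fire until the blocking call returns:

```javascript
// Hypothetical stand-in for a slow synchronous call: spin for `ms` milliseconds.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy-wait; the event loop is stuck here
  }
}

// Schedule a zero-delay timer, then block. Resolves with how long the
// timer actually had to wait before it was allowed to run.
function measureTimerDelay(blockMs) {
  return new Promise(function (resolve) {
    const scheduled = Date.now();
    setTimeout(function () {
      resolve(Date.now() - scheduled);
    }, 0);
    blockFor(blockMs); // nothing else, including the timer above, runs during this
  });
}
```

So a synchronous cache lookup is only reasonable when it is an in-memory operation that returns almost immediately; anything that can take 100ms would stall every other request for that long.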

howsta 14 years ago | |

Amen to that. The power of Node is this language style already makes tons of sense to experienced front-end developers who deal with async flow for everything they do.

Fenn 14 years ago | |

I heartily agree - I may have overstated my case slightly with the "Giant Step Backwards", it just definitely seemed that way on initial experience.

It also made for a better title than "Confusion, Then Indifference, Slowly Turning Into Understanding & Affinity"

randylahey 14 years ago | | |

It's kind of a dick move to make an inflammatory, flamebait blog post and then later admit you didn't actually know what you were talking about. Maybe you should have actually understood the technology you were working with before making knee-jerk snap judgements.

pkulak 14 years ago |

In my experience Node.js is more difficult than synchronous code. But it's also, by far, the easiest way to get something running that's massively parallel.

I recently wrote a project that needs to do hundreds or thousands of possibly slow network requests per second. The first try was Ruby threads. That was a disaster (as I should have predicted). I had an entire 8-core server swamped and wasn't getting near the performance I needed.

The next try was node. I got it running and the performance was fantastic. A couple orders of magnitude faster than the Ruby solution and a tenth of the load on the box. But, all those callbacks just didn't sit right. Finding the source of an exception was a pain and control flow was tricky to get right. So, I started porting to other systems to try to find something better. I tried Java (Akka), EventMachine with/without fibers, and a couple others (not Erlang though).

I could never get anything else close to the performance of Node. They all had the same problems I have with Node (mainly that if something breaks, the entire app just hangs and you never know what happened), but they were way more complicated, _harder_ to debug, and slower.

I have a new appreciation for Node now. And now that I'm much more used to it, it's still difficult to do some of the more crazy async things, but I enjoy it a lot more. It's a bit of work, and you have to architect things carefully to avoid getting indented all the way to the 80-char margin on your editor, but you get a lot for that work.

benatkin 14 years ago |

Nice article. Be sure to read the comments, as the author links to a library that makes the second example easy to rewrite in a short and elegant way. https://github.com/caolan/async#forEach

Also the first example, the cache hitting and missing, could be rewritten with async, too.

    async.waterfall([
      function(callback) {
        asynchronousCache.get("id:3244", callback);
      },
      function(myThing, callback) {
        if (myThing == null) {
          asynchronousDB.query("SELECT * from something WHERE id = 3244", callback);
        } else {
          callback(null, myThing);
        }
      },
      function(myThing, callback) {
        // We now have a thing from the DB or cache, do something with result
        // ...
      }
    ]);
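To see what waterfall is doing with those callbacks, here is a toy re-implementation. It is a sketch of the idea, not the async library's source, and it assumes the Node convention that the first callback argument is reserved for an error, so results are passed as `callback(null, value)`:

```javascript
// Toy waterfall: run tasks in order, feeding each one the previous results.
function waterfall(tasks, done) {
  function step(index, args) {
    if (index === tasks.length) {
      // All tasks finished: report success with the final results.
      return done.apply(null, [null].concat(args));
    }
    // Each task receives the previous results plus a Node-style callback.
    tasks[index].apply(null, args.concat(function (err) {
      if (err) return done(err); // the first error skips the remaining tasks
      step(index + 1, Array.prototype.slice.call(arguments, 1));
    }));
  }
  step(0, []);
}
```

Passing a result in the error slot would make this toy, like the real library, treat it as a failure and stop.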

moe 14 years ago | |

Excuse me, but that's what you call elegant?

From a readability standpoint I'll take the "old" version any day:

   function getFromDB(foo) {
      var result = asynchronousCache.get("id:3244");
      if ( null == result ) {
         result = asynchronousDB.query("SELECT * from something WHERE id = 3244");
      }
      return result;
   }

yummyfajitas 14 years ago |

Just curious, is there a good futures library for Node? If so, you could write code of the form:

    x = db.getFutureResult("x");
    y = db.getFutureResult("y");
    whenFuturesReady([x,y], function(x, y) {
        useResults(x,y);
    });

This looks reasonably similar to typical synchronous code,

    x = db.getResult("x")
    y = db.getResult("y")
    useResults(x,y)

but it allows db queries to happen simultaneously and doesn't break the node paradigm.
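Native Promises, which arrived in JavaScript after this thread, cover this case. In the sketch below, `db` is a hypothetical object whose `getFutureResult` resolves after a short delay; `Promise.all` is the real built-in and plays the part of `whenFuturesReady`:

```javascript
// Hypothetical database whose queries resolve after a short delay.
const db = {
  started: [],
  getFutureResult: function (key) {
    db.started.push(key); // record that the query was issued
    return new Promise(function (resolve) {
      setTimeout(function () { resolve("value of " + key); }, 10);
    });
  }
};

function useBoth() {
  const x = db.getFutureResult("x"); // both queries are in flight...
  const y = db.getFutureResult("y"); // ...before either has finished
  return Promise.all([x, y]).then(function (results) {
    // results arrive in the same order as the input array
    return results[0] + " and " + results[1];
  });
}
```

Because both queries are started before anything waits on them, the total time is roughly that of the slower query rather than the sum of the two.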

papaf 14 years ago | |

It never became production ready, but I found this article an interesting discussion of the problem and a solution to it in Coffeescript:

http://gfxmonk.net/2010/07/04/defer-taming-asynchronous-java...

richcollins 14 years ago |

It's unfortunate that Node doesn't use actor based coroutines. Most of the code readability issues would go away.

moe 14 years ago | |

Count me in.

There's a die-hard core of callback proponents (especially in twisted- and lately in node-land) who claim the pure callback-style is more predictable, robust and testable.

This is not my experience. I've been through that with twisted (heavily), some with EventMachine and some with node.js.

The range of use-cases where I'd benefit from that style was extremely narrow.

For most tasks it would turn into a tedium of keeping track of callbacks and errbacks, littering supposedly linear code-paths with a ridiculous number of branches, and constantly working against test-frameworks that well covered the easy 90% but then fell down on the interesting 10% (i.e. verifying the interaction between multiple requests or callback-paths).

I'm sticking to coroutines where possible now (eventlet/concurrence) and remain baffled over the node-crew's resistance against adding meaningful abstractions to the core.

I like javascript a lot (more so with coffee), but I see little benefit in dealing with the spaghetti when that doesn't even give me transparent multi-process or multi-machine scalability.

And to prevent the obligatory: Yes, I know about Step, dnode and the likes. They remain kludges as long as the default style (i.e. the way all libraries and higher level frameworks are written) is callback-bolognese.

richcollins 14 years ago | | |

The best implementation of concurrency that I've used: http://www.iolanguage.com/scm/io/docs/IoGuide.html#Concurren...

olegp 14 years ago |

It's possible to have the best of both worlds by using node-fibers (https://github.com/laverdet/node-fibers) and mixing synchronous and asynchronous styles as appropriate.

I believe that JavaScript could become the dominant language on the server. We just need to have a set of consistent synchronous interfaces across the major server side JavaScript platforms. This would allow for innovation and code reuse higher up the stack.

I'm doing my bit by maintaining Common Node (https://github.com/olegp/common-node), which is a synchronous CommonJS compatibility layer for Node.js.

jjm 14 years ago |

I would rather have seen "Async programming or Node.JS is hard to 'get right'" than a "Step backward" - which I don't see it being at all.

sausagefeet 14 years ago |

Nitpick: Node doesn't do anything in parallel, it can't, it does things concurrently.

IbJacked 14 years ago | |

But, um, concurrently means "at the same time," too. Merriam's first definition is actually "running parallel."

Wouldn't it be better to describe it as running serially, using non-blocking asynchronous function calls? Guess that doesn't really roll off the tongue, though.

sausagefeet 14 years ago | | |

In computer science the two terms are generally distinguished. You're free to redefine them, but commonly when people say concurrency they refer to a model of performing several things over the same period, which may not actually be simultaneous. Parallelism is generally the actual act of running them simultaneously.
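A small sketch of concurrency without parallelism, using async functions from modern JavaScript (the names `task` and `log` are made up for the illustration). Two tasks make progress in interleaved steps, yet only one of them is ever executing at a given instant:

```javascript
const log = [];

// Record each step, then yield so other pending work gets a turn.
async function task(name, steps) {
  for (let i = 1; i <= steps; i++) {
    log.push(name + i);
    await null; // give up the thread until the next microtask turn
  }
}

// Start both tasks; neither runs to completion before the other starts.
const finished = Promise.all([task("A", 2), task("B", 2)]);
```

Once `finished` resolves, `log` shows the steps interleaved (A1, B1, A2, B2) even though everything ran on a single thread.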

randylahey 14 years ago | |

Nitpick: Incorrect. Node.js uses a thread-pool to run things which are necessarily blocking, such as some POSIX functions.

gcr 14 years ago | | |

sausagefeet meant we can't get to that thread pool from user code.

Fenn 14 years ago | |

Hmm, nice spot. Incorrect assumption of synonyms on my part.

Maro 14 years ago |

When you write an async. server in C++, where you can't inline functions, you write functions like OnRead(), OnWrite(), etc. Once you get used to it the whole thing ends up fairly easy to read and understand. Eg.

https://github.com/scalien/scaliendb/blob/master/src/Framewo...

I guess the OP is saying inlining [in a language where this is even possible] leads to unreadable code, which sounds about right.

robinhowlett 14 years ago |

Not to nitpick but the linked article's title is "NODE.JS - A GIANT STEP BACKWARDS?". That question mark is important.

lea 14 years ago |

I would write something like this:

  function handler(yes, no) {
      return function (err, data) {
          if (data) {
              yes(err, data);
          }
          else {
              no(err, data);
          }
      }
  }

  function get() {
      function done(err, data) {
  
          // do something with data
  
      }
      function db() {
          asynchronousDb.query("SELECT * FROM something WHERE id = 3244", done);
      }
      asynchronousCache.get("id:3244", handler(done, db));
  }

karavelov 14 years ago |

Though async event-driven programming is somewhat confusing in the beginning, there are some idioms that could make your code more comprehensible.

My experience (mostly in perl - EV, AnyEvent, etc.) is that combining events with finite state machines gives more structured code, with smaller functions that interact in a predefined manner.
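In JavaScript that idiom might look like the sketch below. `makeConnection`, its states, and its event names are all hypothetical; the point is only that each state lists the events it accepts, so every handler stays small and unexpected events fail loudly:

```javascript
// Hypothetical connection handler: each state names the events it accepts.
function makeConnection() {
  const conn = { state: "idle", received: [] };

  // state -> event -> handler returning the next state
  const transitions = {
    idle: {
      connect: function () { return "connected"; }
    },
    connected: {
      data: function (chunk) { conn.received.push(chunk); return "connected"; },
      close: function () { return "closed"; }
    },
    closed: {}
  };

  // Feed an event in; events that are invalid for the current state are rejected.
  conn.handle = function (event, payload) {
    const handler = transitions[conn.state][event];
    if (!handler) {
      throw new Error("unexpected " + event + " in state " + conn.state);
    }
    conn.state = handler(payload);
    return conn.state;
  };

  return conn;
}
```

An event loop would call `conn.handle("data", chunk)` from its I/O callbacks; the table, not nested closures, decides what is legal next.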

Detrus 14 years ago |

The V8 team is thinking of adding yield/defer support to make programming in Node neater. There's hope yet.
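Generators with `yield` were later standardized in JavaScript, and they allow roughly the style being hoped for here. The sketch below assumes a small hand-written driver, `run`, plus hypothetical `cacheGet` and `dbQuery` functions that return promises; it is an illustration of the technique, not the API of any particular library:

```javascript
// Drive a generator: each yielded promise pauses it until the promise settles.
function run(generatorFn) {
  return new Promise(function (resolve, reject) {
    const gen = generatorFn();
    function advance(method, value) {
      let step;
      try {
        step = gen[method](value); // resume the generator with a value or an error
      } catch (err) {
        return reject(err);
      }
      if (step.done) return resolve(step.value);
      Promise.resolve(step.value).then(
        function (v) { advance("next", v); },
        function (e) { advance("throw", e); }
      );
    }
    advance("next", undefined);
  });
}

// Hypothetical async lookups: the cache always misses, the DB always hits.
const cacheGet = function (key) { return Promise.resolve(null); };
const dbQuery = function (sql) { return Promise.resolve({ id: 3244 }); };

function getThing() {
  return run(function* () {
    let thing = yield cacheGet("id:3244");
    if (thing === null) {
      thing = yield dbQuery("SELECT * FROM something WHERE id = 3244");
    }
    return thing;
  });
}
```

The body of the generator reads like the synchronous version, with ordinary `if` and `try/catch`, while each `yield` still hands control back to the event loop.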

Meanwhile there are other choices that are about as easy, like Python libraries and Google's Go. Too bad they don't have the same zealous community support.

azakai 14 years ago | |

> The V8 team is thinking of adding yield/defer support to make programming in Node neater. There's hope yet.

There is SpiderNode, not sure what the status of it is, but it replaces V8 in node.js with SpiderMonkey. SpiderMonkey already has yield and much other new JS syntactic sugar.

http://blog.zpao.com/post/4620873765/about-that-hybrid-v8mon...

daakus 14 years ago | |

Marcel Laverdet has built a node extension that provides yield: https://github.com/laverdet/node-fibers

Fenn 14 years ago | |

Hmmm, interesting! I hadn't heard about that - Do you have a reference URL?

Detrus 14 years ago | | |

http://youtu.be/seX7jYI96GE?t=52m35s

Mentions they're working closely with the node team here. And that whole talk is about fixing up JavaScript into a modern language, remove the weird syntax quirks around classes, modules, etc. Say what you mean instead of the weird closure soup.

sausagefeet 14 years ago | | |

TameJS adds some nice primitives on top of Node.

Tichy 14 years ago |

While I am also drawn to NodeJS, I wonder if it wouldn't make more sense to use a language that supports coroutines. Not sure which ones would apply - probably Racket, as they seem to do everything?

moonlighter 14 years ago |

I liked the post, sans the attention-grabberish title (which made me read the darn thing to begin with, though ;)

nirvana 14 years ago |

If you're interested in keeping up to date with the project I describe below, please follow me on twitter @NirvanaCore.

I had many of the same concerns with node.js. Every time I attempted to wrap my head around how I'd write the code I needed to write, it seemed like node was making it more complicated. Since I learned erlang several years ago, and first started thinking about parallel programming a couple decades ago, this seemed backwards to me. Why do event driven programming, when erlang is tried and true and battle tested?

The reason is, there isn't something like node.js for erlang, and so I set out to fix that.

For about a year I've been thinking about design, and for a couple months I've been implementing a new web application platform that I'm calling Nirvana. (Sorry if that sounds pretentious. It's my personal name- I've been storing up over a decade's worth of requirements for my "ideal" web framework.)

Nirvana is made up of an embarrassingly small amount of code. It allows you to build web apps and services in coffeescript (or javascript) and have them execute in parallel in erlang, without having to worry too much about the issues of parallel programming.

It makes use of some great open source projects (which do all the heavy lifting): Webmachine, erlang_js and Riak. I plan to ship it with some appropriate server side javascript and coffee script libraries built in.

Some advantages of this approach: (from my perspective)

1) Your code lives in Riak. This means rather than deploying your app to a fleet of servers, you push your changes to a database.

2) All of the I/O actions your code might do are handled in parallel. For instance, to render a page, you might need to pull several records from the database, and then based on them, generate a couple map/reduce queries, and then maybe process the results from the queries, and finally you want to render the results in a template. The record fetches happen in parallel automagically in erlang, as do the map/reduce queries, and components defined for your page (such as client js files, or css files you want to include) are fetched in parallel as well.

3) We've adopted Riak's "No Operations Department" approach to scalability. That is to say, every node of Nirvana is identical, running the same software stack. To add capacity, you simply spin up a new node. All of your applications are immediately ready to be hosted on that node, because they live in the database.

4) Caching is built in, you don't have to worry about it. It is pretty slick- or I think it will be pretty slick-- because Basho did all the heavy lifting already in Riak. We use a Riak in-memory backend, recently accessed data is stored in RAM on one of the nodes. This means each machine you add to your cluster increases the total amount of cache RAM available.

5) There's a rudimentary sessions system built in, and built in authentication and user accounts seem eminently doable, though not at first release. Also templating, though use any js you want if you don't like the default.

So, say, you're writing a blog. You write a couple handlers, one for reading an article, one for getting a list of articles and one for writing an article. You tie them to /, /blog/article-id, and /post. For each of these handlers, any session information is present in the context of your code.

To get the list of articles, you just run the query, format the results as you like with your template preference and emit the html. If it is a common query, you just set a "freshness" on it, and it will be cached for that long. (E.g., if you post new articles once a week, you could set the freshness to an hour and it would pull results from the cache, only doing the actual query once an hour.)
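The freshness idea can be sketched in a few lines of JavaScript. Everything here is hypothetical (`makeFreshCache`, its arguments, and the in-process `Map`); it shows the general time-to-live technique and says nothing about how Nirvana or Riak implement it:

```javascript
// Hypothetical freshness cache; `now` is injectable so the behaviour is easy to test.
function makeFreshCache(now) {
  const entries = new Map();
  return function cached(key, freshnessMs, compute) {
    const entry = entries.get(key);
    if (entry && now() - entry.storedAt < freshnessMs) {
      return entry.value; // still fresh: skip the query
    }
    const value = compute(); // stale or missing: run the real query
    entries.set(key, { value: value, storedAt: now() });
    return value;
  };
}
```

With a freshness of one hour, `cached("article-list", 60 * 60 * 1000, runQuery)` would call the hypothetical `runQuery` at most once per hour, however many requests arrive.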

To display a particular article, run a query for the article id from the URL (which is extracted for you) and, again, this can be cached. For posting, you can check the session to see if the person is authorized, or the header (using cookies) and push the text into a new record, or update an existing record. Basically this is like most other frameworks, only your queries are handled in parallel.

The goal is to allow rapid development of apps, easy code re-use, and easy, built-in scalability, without having to think much about scalability, or have an ops department.

This is the very first time I've publicly talked about the project. I think that I'm doing something genuinely new, and genuinely worth doing, but it's possible I've overlooked something important, or otherwise embarrassed myself. I don't mean to hijack this thread, but felt that I needed to out my project sometime. A real announcement will come when I ship.

If you're interested in keeping up to date with the project I describe above, please follow me on twitter @NirvanaCore.

EDIT TO ADD: -- This uses Riak as the database with data persisted to disk in BitCask. The Caching is done by a parallel backend in Riak (Riak supports multiple simultaneous backends) which lives in RAM. So, the RAM works as a cache but the data is persisted to disk.

rmason 14 years ago |

I think it's a job for the jquery team, a node.js for the rest of us. Once they get jquery mobile out the door it would seem to be the most obvious next project.

walrus 14 years ago | |

node.js is node.js for the rest of us :) jQuery smooths out browser inconsistencies and replaces an overly verbose API (DOM). node.js has neither of these issues.

gcr 14 years ago | |

Huh? I don't follow -- do you mean that jQuery is teaching people to use anonymous functions everywhere and that's sometimes similar to how some people code with node?