The Node.js cpu blocking thing

UPDATE: TL;DR: “Because nothing blocks, less-than-expert programmers are able to develop fast systems,” says Node’s About page, yet it is actually easy for programmers to block Node’s event loop, thereby reducing concurrency. Go’s approach to concurrency makes it easier for programmers to take full advantage of multi-core processors.

Recently a budding entrepreneur told me they were using CoffeeScript, Node.js and MongoDB to create their server application. I asked if they were aware that Node is single-threaded by design, so the server would block on CPU-intensive code. The response was a puzzled look followed by some mumbling about how Node doesn’t block because it uses an event loop and asynchronous callbacks.

This reminded me of Ted Dziuba’s highly inflammatory post “Node.js is Cancer”, where many on HackerNews completely missed the issues raised. With JGC’s recent benchmarking adventure “To boldly Go where Node man has gone before”, and history somewhat repeating itself in the HackerNews comments, I figured it was time to create my own example to demonstrate that Node.js cpu blocking thing.

Building on previous examples, consider a simple server which upon receiving a request, performs an entirely contrived piece of work (since “fibonacci related benchmarks should die in a fire” :-).
UPDATE: Any work that consumes CPU cycles is real work, so feel free to insert your own file encryption or photo filtering code.

Here’s the Go server code:
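The original listing is not shown here, so what follows is only a minimal sketch of the kind of server described above, not the code that produced the benchmark numbers. The names `work` and `handler`, the loop count of four million and port 8080 are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// work is a contrived CPU-bound task (hypothetical): it sums the
// integers 0..n-1 and never touches the network or the disk.
func work(n int) int {
	result := 0
	for i := 0; i < n; i++ {
		result += i
	}
	return result
}

// handler burns some CPU on every request, then writes the result.
func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "result: %d\n", work(4000000)) // loop count is an assumption
}

func main() {
	http.HandleFunc("/", handler)
	// net/http serves each incoming connection in its own goroutine.
	log.Fatal(http.ListenAndServe(":8080", nil)) // port is an assumption
}
```

Swap the body of `work` for any CPU-heavy routine of your own; the shape of the server stays the same.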

Here’s the Node server code:

To benchmark, we’ll run ab on Ubuntu and ask it to connect to the Go (1.0.1) or Node (0.6.18) server running on OS X 10.7.3 with a quad-core i7 processor and 4 GB of RAM. Here are the results of making 1,000 requests.

ab -n 1000

Go
Time taken for tests:   54.994 seconds
Requests per second:    18.18 [#/sec]
Time per request:       54.994 [ms]
Transfer rate:          1.70 [Kbytes/sec]

Node
Time taken for tests:   58.182 seconds
Requests per second:    17.19 [#/sec]
Time per request:       58.182 [ms]
Transfer rate:          1.28 [Kbytes/sec]

Let’s make another 1,000 requests, this time with 100 concurrent requests.

ab -n 1000 -c 100

Go (process launched with environment variable GOMAXPROCS=4)
Time taken for tests:   17.416 seconds
Requests per second:    57.42 [#/sec]
Time per request:       17.416 [ms]
Transfer rate:          5.38 [Kbytes/sec]

Node
Time taken for tests:   50.601 seconds
Requests per second:    19.76 [#/sec]
Time per request:       50.601 [ms]
Transfer rate:          1.47 [Kbytes/sec]

As you can see, when requests are sent to the server concurrently, the Go server speeds up dramatically and is much faster than the Node server. It’s almost as if the Node server can’t handle concurrent requests and is simply processing them one at a time.

So what’s happening? Well, for each incoming request, the Go server kicks off a goroutine, and with GOMAXPROCS=4 the runtime can run those goroutines in parallel across the four cores. Goroutines are multiplexed onto OS threads by the Go runtime, but that’s an implementation detail. By contrast, the Node server has only a single thread of execution for JavaScript, so the event loop is blocked while a request is being processed.
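To see the effect of GOMAXPROCS in isolation, here is a small sketch that runs the same CPU-bound work in one goroutine per simulated request, first with a single processor and then with all of them. The helper names `spin` and `runAll` and the workload sizes are assumptions for illustration, and the timings will vary by machine. Note that since Go 1.5 GOMAXPROCS defaults to the number of available CPUs, so the environment variable used in the benchmarks above matters mainly for old releases such as 1.0.1.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// spin is a stand-in for CPU-bound request handling (hypothetical).
func spin(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// runAll handles `requests` simulated requests, one goroutine each,
// with at most `procs` OS threads executing Go code at once. It returns
// the per-request results and the elapsed wall-clock time.
func runAll(procs, requests, n int) ([]int, time.Duration) {
	prev := runtime.GOMAXPROCS(procs) // returns the previous setting
	defer runtime.GOMAXPROCS(prev)

	results := make([]int, requests)
	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < requests; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = spin(n) // each goroutine writes its own slot
		}(i)
	}
	wg.Wait()
	return results, time.Since(start)
}

func main() {
	const requests, n = 8, 50000000
	_, one := runAll(1, requests, n)
	_, all := runAll(runtime.NumCPU(), requests, n)
	fmt.Printf("GOMAXPROCS=1:  %v\n", one)
	fmt.Printf("GOMAXPROCS=%d: %v\n", runtime.NumCPU(), all)
}
```

On a multi-core machine the second run should finish noticeably sooner, which is the same pattern the ab numbers show for the Go server under concurrent load.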

If your brain is starting to think about algorithms, real-world latency and optimisation, stop! The benchmarks don’t really matter. The important thing to take away here is that a Node server’s event loop can be easily blocked. This might come as a shock to some Node beginners but it shouldn’t. Quoting from Tom’s introductory Node book:

“This single-threaded concept is really important. One of the criticisms leveled at Node.js fairly often is its lack of “concurrency.” That is, it doesn’t use all of the CPUs on a machine to run the JavaScript.”

“Because Node relies on an event loop to do its work, there is the danger that the callback of an event in the loop could run for a long time. This means that other users of the process are not going to get their requests met until that long-running event’s callback has concluded.” 

“As we’ve mentioned, Node is single-threaded. This means Node is using only one processor to do its work. However, most servers have several “multicore” processors, and a single multicore processor has many processors. A server with two physical CPU sockets might have “24 logical cores”, that is, 24 processors exposed to the operating system. To make the best use of Node, we should use those too. So if we don’t have threads, how do we do that?”
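The long-running-callback danger described in these quotes can be mimicked in a few lines of Go. This is only a simulation under stated assumptions: a single goroutine drains a queue of callbacks the way a single-threaded event loop would, and the names `task`, `eventLoop` and `busy` are made up for the example. It is not how Node is implemented.

```go
package main

import (
	"fmt"
	"time"
)

// task is one callback queued on the simulated event loop.
type task struct {
	name string
	run  func()
}

// eventLoop runs queued callbacks one at a time on a single goroutine
// and returns the order in which they finished, along with how long
// each one waited before it was started.
func eventLoop(tasks []task) (order []string, waited []time.Duration) {
	queue := make(chan task, len(tasks))
	start := time.Now()
	for _, t := range tasks {
		queue <- t // every callback is ready at roughly time zero
	}
	close(queue)

	for t := range queue {
		waited = append(waited, time.Since(start))
		t.run() // nothing else is serviced until this returns
		order = append(order, t.name)
	}
	return order, waited
}

// busy keeps the CPU spinning for roughly d without sleeping.
func busy(d time.Duration) {
	for end := time.Now().Add(d); time.Now().Before(end); {
	}
}

func main() {
	order, waited := eventLoop([]task{
		{"slow", func() { busy(200 * time.Millisecond) }},
		{"fast", func() {}},
	})
	for i, name := range order {
		fmt.Printf("%s waited %v before running\n", name, waited[i].Round(time.Millisecond))
	}
}
```

The "fast" callback does no work at all, yet it cannot start until the "slow" one has finished, which is exactly the situation the second quote warns about.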

Fortunately, there’s no need to fret. To take advantage of multiple processors, simply use a package called Cluster, which forks child processes to run as individual Node servers.  So here’s the code for the Node server again, this time using Cluster.

As expected, when four Node servers are launched, the results improve substantially.

ab -n 1000 -c 100

Go (1 process launched with environment variable GOMAXPROCS=4)
Time taken for tests:   17.416 seconds
Requests per second:    57.42 [#/sec]
Time per request:       17.416 [ms]
Transfer rate:          5.38 [Kbytes/sec]

Node Cluster (4 processes launched)
Time taken for tests:   26.967 seconds
Requests per second:    37.08 [#/sec]
Time per request:       26.967 [ms]
Transfer rate:          2.75 [Kbytes/sec]

Using Cluster looks simple, but if you’re a conscientious programmer, you now have a few things to worry about, for example “Forking a new worker when a death occurs” or “Monitoring worker health using message passing”. Digging deeper, you might even begin to question the very use of Cluster! Node add-ons like WebWorkers, Fiber (non-preemptive scheduling) and Threads a Go Go (real threads) offer alternative approaches. Confused yet?
UPDATE: Here’s an example: !topic/nodejs/RS5Whcqbgq4/discussion

By contrast, Go has concurrency baked in. Goroutines and channels provide a simple and elegant approach to writing fast server applications. A single Go server offers a level of concurrency matched only by a cluster of Node servers. There are many reasons to like Go, which is why it’s my language of choice.
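As a rough illustration of goroutines and channels working together, the sketch below fans a few CPU-bound jobs out to a fixed number of worker goroutines and collects the answers over a channel. The types `job` and `result` and the functions `sumTo` and `process` are hypothetical names chosen for this example.

```go
package main

import "fmt"

// job and result are hypothetical request and response types.
type job struct{ id, n int }
type result struct{ id, sum int }

// sumTo is contrived CPU-bound work: 0 + 1 + ... + (n-1).
func sumTo(n int) int {
	s := 0
	for i := 0; i < n; i++ {
		s += i
	}
	return s
}

// process fans the jobs out to `workers` goroutines (workers must be
// at least 1) and gathers the answers over a channel, keyed by job id.
func process(jobs []job, workers int) map[int]int {
	in := make(chan job)
	out := make(chan result)

	for w := 0; w < workers; w++ {
		go func() {
			for j := range in {
				out <- result{j.id, sumTo(j.n)}
			}
		}()
	}
	go func() {
		for _, j := range jobs {
			in <- j
		}
		close(in) // lets the workers' range loops finish
	}()

	sums := make(map[int]int, len(jobs))
	for range jobs {
		r := <-out
		sums[r.id] = r.sum
	}
	return sums
}

func main() {
	jobs := []job{{1, 10}, {2, 100}, {3, 1000}}
	sums := process(jobs, 4)
	fmt.Println(sums[1], sums[2], sums[3]) // prints: 45 4950 499500
}
```

There is no worker restart logic or health monitoring to write here: the workers live inside one process and simply exit when the input channel is closed.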

Node is great for JavaScript developers, but for everyone else there’s probably already a comparable solution close to home. As for buzzword entrepreneurs, there are always greener pastures!

UPDATE: Node.js developers recently experimented with threads instead of processes: !msg/nodejs/zLzuo292hX0/F7gqfUiKi2sJ

8 Responses to The Node.js cpu blocking thing

  1. Anonymous says:

    Take a look at http://vertx.io. Similar to Node, but for the JVM. Polyglot. And doesn't have the Node blocking issue.

  2. Simon says:

    The major issue with VertX is that their web page isn't pretty like the others 🙂 Other similar projects are NodePHP and VibeD.

  3. Olov Lassus says:

    I'll let others comment on the benchmark itself but I thought you might want to know that your JS code doesn't do what I think that you think it does. It currently computes NaN 4 million times (you forgot to initialize result). restrict mode for JavaScript finds the error: "Error: restrict mode + called with undefined and number".

  4. Simon says:

    Thanks Olov, it was a test to see if anybody would actually read and run their own tests ;-). As you say yourself, "Even the best make mistakes due to loose semantics." You might find it interesting that in Go, by default, all variables are set to zero upon declaration.

  5. I have just celebrated my thirtieth birthday.
    Aubrey at your service.
    I have gone back to school to become a caterer! I may be curious at times, but that is hardly
    a fault, is it?

  6. Everything you say is true, except you don’t know what the word “blocking” means. In computer science, “blocked” does not mean “busy”. In fact, it means the opposite. “Blocked” means that a thread *cannot* do any work because it is waiting for something else to happen.

    So yes, Node.js typically does not block in the truest sense of the word, because most of its I/O is asynchronous. (This is how the process model is presented to the Node.js developer; obviously, under the hood, many operating systems and I/O libraries won’t have truly event-oriented APIs. In those cases, Node.js spawns worker threads to handle synchronous I/O to maintain the illusion of async I/O on the main thread.)

    If your thread is busy because it is doing something CPU intensive, that is not “blocking”. That’s just being busy.

  7. Pingback: NodeJS upload files 100% cpu usage | DL-UAT
