1. The Point and Those Who Miss It

    A lot has been said about Ted’s successful troll and the several responses that seem to miss the point—including mine.

    To that I say: if a point is never properly made, can one truly miss it?

    There are two reasons people are benchmarking Fibonacci in response to Ted’s post:

    1. Ted himself benchmarked it and concluded it was slow. This is wrong—it’s comparatively fast. But nobody thinks this was his entire point. It wasn’t. Let’s move on.
    2. Because it’s an order of magnitude faster, every single one of Ted’s clients in the ab benchmark would receive a response before even a single client of an equivalent Python or Ruby service, regardless of their level of concurrency. Even though each request blocked the next, it would still have far better throughput than your typical Python or Ruby deployment. (This of course depends on which language runtime the deployment uses, which is why I’m stressing “typical.”) This has everything to do with the speed of V8, and nothing to do with Node’s event-based I/O.

    So even the demonstration of a blocked request pipeline was rather poor. Still want to blame anyone for missing the point?

    People are excited about Node because you can handle a large number of requests before even beginning to think about parallelization. If you start with a blazing fast runtime and request handler, you can put off parallelization until it matters. And it may never matter, because most web services are I/O-bound. Node also makes asynchronous programming very approachable.

    Taken out of context, Node’s claim that “nothing blocks” can indeed be confusing. And I’m not defending the claim that “less-than-expert programmers are able to develop fast systems.” But Ted deliberately took “nothing blocks” out of context. Node’s site is very clear that it’s talking about blocking I/O. For CPU-intensive tasks, the Unix way is alive and well.

    Running multiple Node processes to improve concurrency is not anti-Unix. It is in fact a widely-accepted approach, albeit one users are left to discover themselves with Node. Some people dismissed projects that serve this purpose, claiming that the introduction of third-party add-ons somehow invalidates the entire approach. Maybe we should stop using mod_python and mod_wsgi then, eh?

    Perhaps there are some less-than-expert programmers out there shooting themselves in the foot with Node and failing spectacularly to scale. But so far they seem to be mythical.

    If you want to read an article that does make a great point about Node’s concurrency—and concurrency in general—Anthony Fejes got it right.

  2. Node.js Cures Cancer

    I apologize for the title. But it’s no less accurate than this embarrassing, poorly-reasoned article by Ted Dziuba. It’s flamebait, but I couldn’t resist. Sigh…here we go.

    First, Ted takes issue with this claim on Node’s homepage:

    Almost no function in Node directly performs I/O, so the process never blocks. Because nothing blocks, less-than-expert programmers are able to develop fast systems.

    He then gives an example of a blocking function as some kind of weird disproof.

    Here’s a fun fact: every function call that does CPU work also blocks. This function, which calculates the n’th Fibonacci number, will block the current thread of execution because it’s using the CPU.

    function fibonacci(n) {
      if (n < 2)
        return 1;
      else
        return fibonacci(n-2) + fibonacci(n-1);
    }
    

    I think Ted might be confused about what exactly Node is. Node is not a language or a framework. Node is a small set of JavaScript modules bundled with a JavaScript runtime. “Almost no function in Node directly performs I/O” means that most functions included in Node’s built-in modules are asynchronous—they return as soon as possible and a callback handles the result. It doesn’t mean that any JavaScript function you write won’t block. It’s still JavaScript with normal semantics.

    The first thing Ted does to disprove this claim is fire up Node’s single-threaded HTTP server in a single process and calculate fibonacci(40) on every request. Here’s his conclusion when the request takes 5.6 seconds to complete:

    5 second response time. Cool. So we all know JavaScript isn’t a terribly fast language, but why is this such an indictment?

    I don’t know Ted, why is it? Maybe let’s try the same thing in Python and Ruby so we can see just how terribly fast other languages are by comparison. For reference, Ted’s example takes 8 seconds on my machine.

    Python:

    from wsgiref.util import setup_testing_defaults
    from wsgiref.simple_server import make_server
    
    def fibonacci(n):
        if n < 2:
            return 1
        else:
            return fibonacci(n-2) + fibonacci(n-1)
    
    def fibonacci_app(environ, start_response):
        status = '200 OK'
        headers = [('Content-Type', 'text/plain')]
        start_response(status, headers)
        return [str(fibonacci(40))]
    
    make_server('', 1337, fibonacci_app).serve_forever()
    

    Result:

    $ time curl http://localhost:1337/
    165580141
    real    1m48.732s
    user    0m0.009s
    sys     0m0.007s
    

    1 minute 48 second response time. Cool. This is the fastest it got after multiple runs, by the way. The results are the same for any Python web server because, surprise, it’s measuring the speed of Python. Just like Ted’s example is measuring the speed of V8.

    Ruby:

    require 'rubygems'
    require 'mongrel'
    
    def fibonacci(n)
        if n < 2
            1
        else
            fibonacci(n - 2) + fibonacci(n - 1)
        end
    end
    
    class FibonacciHandler < Mongrel::HttpHandler
        def process(request, response)
            response.start(200) do |head, out|
                head["Content-Type"] = "text/plain"
                out.write(fibonacci(40))
            end
        end
    end
    
    h = Mongrel::HttpServer.new("127.0.0.1", "1337")
    h.register("/", FibonacciHandler.new)
    h.run.join
    

    Result:

    $ time curl http://localhost:1337/
    165580141
    real    3m18.429s
    user    0m0.011s
    sys     0m0.009s
    

    3 minute 18 second response time. Cool.

    What exactly did he expect to happen in this case? We wrote a function that does block, and told Node to call it on every request. How could Node’s claim that “almost no function in Node directly performs I/O, so the process never blocks” possibly have any bearing at all on this result?

    Maybe the claim that “less-than-expert programmers are able to develop fast systems” is misleading, but so far it’s actually sounding almost reasonable. Getting the response time down to 5 seconds was effortless, after all…

    Ted then discredits Node for disobeying “the Unix way.” This indeed sounds pretty lamentable until you realize he just means that Node doesn’t ship with a CGI module, which is true.

    How odd, then, that Node’s lack of built-in CGI support hasn’t caused an exodus en masse. People still seem to be getting along just fine without it. And even if they weren’t, nothing is stopping anybody from developing a CGI module for Node. So why haven’t they?

    One reason could be that Node’s built-in web server can easily outperform Apache—even in high-concurrency tests. This is apparently dismissible since it violates “the Unix way” of separation of responsibility. All the people who are getting stuff done with Node should just drop what they’re doing, because according to Ted they’re conceptually impure.

    Never mind that most developers will never need the scale that loosely coupled services would provide.

    Conceptually, this is how any web application architecture that’s not cancer still works today: you have a web server program whose job is to accept incoming requests, parse them, and figure out the appropriate action to take. That can be either serving a static file, running a CGI script, proxying the connection somewhere else, whatever. The point is that the HTTP server isn’t the same entity doing the application work.

    Look, I totally agree with this. From an architectural standpoint, it’s the way to go. I’m sure someday, Node will get its own equivalent of WSGI for Python and Rack for Ruby. That will be an exciting time. But nobody seems to be in a hurry to get there. [Update: Except for Connect and Strata.]
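    The WSGI/Rack idea itself is simple enough to sketch in JavaScript: an app is just a function from a request to a response, and middleware is a function that wraps one app in another (a hypothetical interface for illustration, not Connect’s or Strata’s actual API):

```javascript
// An app is a plain function from a request object to a response object.
function app(request) {
  return { status: 200, body: 'Hello from ' + request.url };
}

// Middleware wraps an app and returns a new app with extra behavior.
function logger(inner) {
  return function (request) {
    console.log('-> ' + request.url);
    return inner(request);
  };
}

// Any server that understands this interface can serve the wrapped app.
var wrapped = logger(app);
```

    With this shape, the HTTP server and the application stay separate entities, which is exactly the architectural point being made here.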

    Finally, Ted concludes that Node was just doomed from the start, since it’s written in JavaScript:

    if (typeof my_var !== "undefined" && my_var !== null) {
      // you idiots put Rasmus Lerdorf to shame
    }
    

    What is this I don’t even…

    Never mind that this code would be even uglier in, say, Python—presumably a non-cancerous language:

    try:
        my_var
    except NameError:
        pass
    else:
        if my_var is not None:
            # Ted needs better examples
            ...
    

    Some people like JavaScript now, Ted. We even have options for people who don’t.

    Look, I’m not that invested in Node, and I don’t have any interest in using its web server for anything. That’s just not what I use it for. But even I could spot the flaws in Ted’s childish article. Please don’t support his inflammatory trash just because you don’t like JavaScript.

    For a much more reasonable take on Node and its performance claims, I recommend this post by Alex Payne of Twitter.

    Discuss this post on Hacker News.

    [Update: I’ve posted a brief followup.]

  3. So, you want to use require() in the browser…

    You’ve written a bunch of JavaScript or CoffeeScript running on Node, which has helped you organize your code into modules, develop a test suite in any of numerous styles, experiment in a decent REPL, and take advantage of useful libraries.

    Now you want to deploy that code to the final frontier—the web browser.

    You have a couple options. A couple dozen, actually. Here are just the libraries that were easy to find, in order of GitHub popularity. The bars represent their number of watchers:

    As you can see, quite a few developers have approached this problem. There are several ways to interpret this:

    • Maybe it’s an easy problem to solve, since so many people have opted to roll their own.
    • Maybe it’s hard to solve, since so many people evidently decided that the others got it wrong.
    • Maybe each person has legitimately different requirements.
    • Maybe it’s just “opinionated”, like web frameworks.

    Downloading code on-demand: related, but different.

    Admittedly, several of these libraries are trying to solve a related, but different problem: downloading the required modules on-demand (usually asynchronously via AJAX).

    While an interesting problem to tackle, I think using this scheme in production is a bit misguided. Compiling your entire application into a single JavaScript file shouldn’t be a problem for anyone. Whose minified and compressed code is bigger than the size of a few JPEGs? (Hint: not Facebook’s, Grooveshark’s, or Pandora’s.) At most you should need a tiny loader script that grabs a couple of large, self-contained blobs of code when the page loads.

    Loading dozens of small modules on demand is just going to result in more HTTP requests and worse compression. The simpler alternative is to already have the modules available, and just run them and return their exports on-demand. Thus, you have one file and no asynchrony.

    Deploying the same code to production that you run in development: the actual problem.

    The larger and more compelling use case that many of these libraries tackle is running the same code in the browser that you wrote for Node. The foremost obstacle here just happens to be the organization of code into CommonJS-like modules. Even if you avoid using Node’s core modules and globals like process and __filename, the use of require for your own modules must still be addressed.

    That’s where these libraries come in.

    I happen to have implemented yet another one of these libraries for my own purposes, so I’m familiar with the problem. In my next blog post, I’ll explain what requirements the most popular libraries satisfy, describe where they fall short, and demonstrate some improvements we could make to truly call them production-ready.

    Discuss this post on Hacker News.

  4. adminbrowse: Fancier changelist columns in Django’s admin site

    I just released a new reusable Django app called django-adminbrowse on GitHub. This was factored out of a Django project I’ve been working on for a while.

    Ever wish it was easier to get around in the Django admin? Maybe you’ve deployed the admin site as a management tool for some less technical users? This project lets you easily spiff up changelist pages to include better text and markup. I’ve been using it in production for a while and it’s been a welcome improvement.

    In the screenshot (click to zoom), you’ll notice:

    • URL fields become clickable links.
    • Foreign key fields link to the change form for the corresponding object.
    • Related objects get a link to their changelist page, filtered appropriately.

    There are a few more features, but the most powerful is that the code is very easily extensible. If you ever wanted to include custom markup in your changelist columns, this makes it really easy.

    See the README or use Python’s help() for starters. More advanced documentation is in progress.

    Enjoy!

  5. The Numbers: Day 70

    More updates are coming! Erin and I will be taking a break from our trip next week in order to celebrate our birthday—October 30 for the both of us—back in Cleveland with friends and family. We’ll be in town from Wednesday, October 27 until Tuesday, November 2, when we’ll most likely continue our journey from Eugene, Oregon.

    • Miles since Cleveland: ~2,000
    • Cycling days: 44
    • Leisure days: 26
    • Rainy days: 4
    • Nights camped: 11
    • Nights in motels: 22½
    • Nights couchsurfed: 36½

    [We half-camped on two occasions—the night in Oberlin, which accounts for half a motel, and in someone’s yard, which I’ll call “campsurfing” and accounts for half a couchsurf.]