Eventlet: Asynchronous I/O for Grownups

Event-driven asynchronous I/O is the newest chatter at the Silicon Valley High Abercrombie table.  Threading, the mode of parallelism we all thought we were so smart for understanding, isn't cool anymore. Everybody who is anybody is using asynchronous I/O, and of course, there are different opinions on how it should be done. This being the software world, you can count on those opinions being vehement.

If you look at the benchmarks, all of the major async libraries for Python are basically on the same operating plane. There's Twisted, Tornado, gevent, and a handful of others, but the one that really stands out in the group is Eventlet. Why is that? Two reasons:

1. You don't need to get balls deep in theory to be productive with Eventlet.
2. You need to modify very little pre-existing code to adapt a program to be event-driven.

Eventlet's approach is that asynchronous code should look like synchronous code. Why? Because it's easy for people to understand synchronous code.  Thinking about callbacks and schedulers is unnecessary; after all, we have work to do. What's more, not only does asynchronous code with Eventlet look synchronous, it can also run synchronously.

Look at this Python snippet:

import urllib2

import lxml.html

def fetch_and_parse(url):
    contents = urllib2.urlopen(url).read()
    tree = lxml.html.fromstring(contents)
    # Do some parsing on the ElementTree; the page title is a placeholder here
    value = tree.findtext('.//title')
    return value

It looks like regular synchronous code, and ostensibly it is. The output of the URL fetch is the input to the HTML parser. However, if you have a ton of URLs to run this on, how would you parallelize it? Threads are an option, but so is Eventlet:

import eventlet
from eventlet.green import urllib2  # cooperative drop-in for the stdlib urllib2

def main():
    # urls is assumed to be an iterable of URL strings defined elsewhere
    green_pool = eventlet.GreenPool(size=10)
    results = []
    for result in green_pool.imap(fetch_and_parse, urls):
        results.append(result)
    return results

This is interesting because all I've done to make a seemingly synchronous piece of code run asynchronously is to patch the library it needs for I/O and give it a driver function. That driver could just as easily have been a series of threads all reading from a Queue and importing the standard library's version of urllib2.

Now hold on a second. This is a painfully contrived example, but it's such a key point: The asynchronous code looks synchronous. It can even function synchronously. All I did to make it use event-driven I/O is change the driver and patch a library. Now this is podracing!
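
For comparison, here is a rough sketch of the thread-based driver mentioned above, written for Python 3 using only the standard library (where the Queue module is spelled queue). The fetch_and_parse here is a stand-in that does no network I/O so the sketch runs offline, and threaded_driver and its parameters are hypothetical names, not anything Eventlet provides. The point is only that the worker function stays the same while the driver changes.

```python
import queue
import threading

def fetch_and_parse(url):
    # Stand-in for the article's function so the sketch runs offline;
    # the real one would fetch the URL and parse the HTML.
    return len(url)

def threaded_driver(func, urls, num_threads=10):
    """Run func over urls using worker threads that read from a Queue."""
    jobs = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            url = jobs.get()
            if url is None:  # sentinel: no more work for this thread
                break
            value = func(url)
            with lock:
                results[url] = value

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for url in urls:
        jobs.put(url)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results

urls = ["http://example.com/a", "http://example.com/bb"]
threaded = threaded_driver(fetch_and_parse, urls)

# The same function also runs with no driver at all, one URL at a time.
serial = {url: fetch_and_parse(url) for url in urls}
```

Both drivers produce the same mapping here; swapping in a green pool would be one more driver over the same untouched function.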

That sort of integration has such massive business value that I will easily disregard any pissing-contest performance gains that Twisted or Tornado may offer. I know that when you have code written in the "old" style, and the powers that be hand down the "new" style, there is an itch to rewrite it, but rewriting known-working code is the worst thing you can do for your project.

The Eventlet developers have gone further than this, providing a facility to monkey-patch the existing system libraries at invocation time. For example, let's say you have a web app that does some Memcached I/O and some database I/O.

from eventlet import patcher
patcher.monkey_patch(all=True)

Oh look. Your application is now using asynchronous I/O. This call patches Python's socket module and a few others to make it all "just work" with Eventlet's internal coroutine switching mechanism. (Caveat: MySQLdb, which uses C-land sockets, needs a little bit of extra treatment, but it's only a couple of lines.)
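
If monkey patching sounds like magic, here is a toy sketch of the mechanism using only the standard library. It is not Eventlet's implementation: Eventlet swaps in cooperative versions that yield to other green threads, whereas the hypothetical recording_sleep below just notes what it was asked to do. What it shows is why existing code needs no changes: slow_job looks up time.sleep at call time, so it picks up whatever is installed there.

```python
import time

def slow_job():
    # Existing code, left untouched: it looks up time.sleep when called.
    time.sleep(5)
    return "done"

recorded = []

def recording_sleep(seconds):
    # Hypothetical replacement: records the request instead of blocking.
    recorded.append(seconds)

original_sleep = time.sleep
time.sleep = recording_sleep  # the monkey patch
try:
    result = slow_job()  # returns immediately, no 5 second wait
finally:
    time.sleep = original_sleep  # put the real function back
```

One practical consequence, assuming the same lookup rules: code that did "from time import sleep" before the patch keeps its own reference to the original, which is why patching is best done as early as possible at startup.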

This isn't just theory: I have actually made a large I/O-bound program work by monkey patching and changing the driver. It is a piece of software that reads jobs from a queue and processes them, putting the result in memcached. For esoteric reasons I will not go into, the job processors could not thread the work; they had to fork. Under that forking setup, one production box with 8GB of RAM was consistently 7.5GB full. After a code change of fewer than five lines to the driver, that same production box uses only around 1GB of RAM consistently, and can handle 5 to 10x the throughput of the old system.

Now compare this to Twisted or Tornado. Twisted tries so damn hard to be Java that it really offends me personally. Those developers strike me as the alpha-programmer types who see no reason not to rewrite an existing codebase for a 20% performance gain.  Tornado, on the other hand, is significantly less Jersey Shore douchebaggy, but they still miss the point: we are programmers who need to get stuff done. Inventing your own HTTP client class when Python's builtin works just fine, if not better, is the type of hubris that gets hotshot programmers fired in their first month.

There's also gevent, which appears to be a fork of Eventlet, but is not as well documented. Partial credit.

It's hard to find a performance- or scaling-related open source library that values my time. Eventlet is one of those rare few.