In a previous post, I described how to build a web chat server with socket.io and gevent. If you're trying to actually learn gevent, however, socket.io is probably not the best place to start. So I figured I'd write this post and provide an overview of gevent.
[Update 2012-07-24] In response to some criticisms of the micro-benchmarks in this post, I reworked the benchmarks and wrote an updated gevent and threads post. Make sure you read that one for more perspective on greenlets vs. threads.
What is Gevent, anyway?
According to the gevent webpage,
gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.
That's a succinct definition, and it identifies all the technologies and implementation architecture of gevent, but it doesn't really give a good "beginner's view." The quickest way I can think of to define it is to say
Gevent gives you threads, but without using threads.
Why not just use threads?
So why not just use threads, then? The biggest drawback of threads for me is that they're relatively resource-intensive compared to greenlets, the thread-like abstraction used in gevent. For instance, here's a minimal program that simulates a "Hello World" webserver, but without any concurrency:
    import sys
    import time
    import socket

    def sequential(port):
        # Accept connections one at a time; each is fully handled before the next accept()
        s = socket.socket()
        s.bind(('0.0.0.0', port))
        s.listen(500)
        while True:
            cli, addr = s.accept()
            handle_request(cli, time.sleep)

    def handle_request(s, sleep):
        try:
            s.recv(1024)
            sleep(0.1)  # simulate 100 ms of work per request
            s.send('HTTP/1.0 200 Ok\n\nHello world')
            s.shutdown(socket.SHUT_WR)
            print '.',  # trailing comma suppresses the newline (Python 2)
        except Exception, ex:
            print 'e', ex,
        finally:
            sys.stdout.flush()
            s.close()

    if __name__ == '__main__':
        sequential(int(sys.argv[1]))
The only thing 'special' about this script is that it does a few things to slow
down the handle_request function to make it (somewhat) more realistic. If we run
this under ab, Apache's benchmarking tool, with lots of concurrency, however, we get
abysmal results. For instance, running with
ab -r -n 200 -c 200 http://localhost:... gives me around 11 requests per second
with lots of errors.
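A rough back-of-the-envelope calculation shows why the sequential server cannot do much better than this. The sketch below assumes the 0.1-second sleep dominates each request and ignores every other cost (accepting the connection, parsing, sending); the function name `max_requests_per_second` is made up for illustration.

```python
def max_requests_per_second(service_time_ms, in_flight):
    """Upper bound on throughput when each request takes `service_time_ms`
    milliseconds and at most `in_flight` requests are handled at once."""
    return in_flight * 1000.0 / service_time_ms

# Sequential server: one request at a time, each sleeping 100 ms.
print(max_requests_per_second(100, 1))    # 10.0

# A server that overlaps all 200 of ab's concurrent requests.
print(max_requests_per_second(100, 200))  # 2000.0
```

The ceiling of roughly 10 requests per second for the one-at-a-time case is in line with the figure measured above. The second number is only what a server could approach in principle if it overlapped every waiting request.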
Maybe we can do better with threads? We can replace the sequential function with
a threads function:
    import threading
    import time

    def threads(port):
        s = socket.socket()
        s.bind(('0.0.0.0', port))
        s.listen(500)
        while True:
            cli, addr = s.accept()
            # Start a new daemon thread for every incoming connection
            t = threading.Thread(target=handle_request, args=(cli, time.sleep))
            t.daemon = True
            t.start()
Running that with ab -r -n 200 -c 200... gives me even worse results; the
benchmark simply refuses to finish, bailing out after 10 errors. Well, it turns
out we can use gevent to give us threadlike behavior without the overhead of
threads:
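    import gevent

    def greenlet(port):
        # Inside this function, gevent's cooperative socket replaces the standard one
        from gevent import socket
        s = socket.socket()
        s.bind(('0.0.0.0', port))
        s.listen(500)
        while True:
            cli, addr = s.accept()
            # Spawn a greenlet per connection; gevent.sleep yields to other greenlets
            gevent.spawn(handle_request, cli, gevent.sleep)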
Now with the same ab parameters we get... 1487 requests per second, which
is about what the threading demo would get if we hadn't crushed it by sending 200
requests at it concurrently.
Why not always use gevent/greenlets?
So why not always use the greenlets in gevent? Mainly, it comes down to a question of
preemption. Greenlets use cooperative multitasking, whereas threads use preemptive
multitasking. What this means is that a greenlet will never stop executing and
"yield" to another greenlet unless it uses certain "yielding" functions (like
gevent.socket.socket.recv or gevent.sleep). Threads, on the other hand, will
yield to other threads (sometimes unpredictably) based on when the operating
system decides to swap them out.
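To make the cooperative model concrete, here is a toy round-robin scheduler built on plain Python generators. This is only an analogy and not how gevent is implemented: the names `worker` and `run_round_robin` are invented for this sketch, and a bare `yield` stands in for a yielding call such as gevent.sleep.

```python
from collections import deque

def worker(name, steps, log):
    """A task that keeps running until it explicitly yields control."""
    for i in range(steps):
        log.append((name, i))
        yield  # the only point at which another task may run

def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)        # run the task up to its next yield
        except StopIteration:
            continue          # the task finished; drop it
        queue.append(task)    # otherwise reschedule it at the back

log = []
run_round_robin([worker('a', 2, log), worker('b', 2, log)])
print(log)  # [('a', 0), ('b', 0), ('a', 1), ('b', 1)]
```

If a worker never reached a `yield` (say, because it was stuck in a long CPU-bound loop), no other task would get a turn. That is the trade-off of cooperative multitasking described above.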
Of course, if you've been using Python for a while, you've heard something about a global interpreter lock (GIL) in Python that only allows a single thread to be executing Python bytecode at a time. So although you have threads in Python, and they give some concurrency (depending on whether the particular extension library you're using releases the GIL appropriately), threads provide less benefit than you might expect coming from C or Java.
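As a small illustration of the "some concurrency" that threads still provide under the GIL, the sketch below runs several blocking tasks in threads and measures the wall-clock time. It relies on the fact that time.sleep releases the GIL while waiting, much as blocking socket calls do; the helper names `blocking_task` and `run_in_threads` are invented for this example.

```python
import threading
import time

def blocking_task(delay):
    # time.sleep releases the GIL while waiting, so other threads can run
    time.sleep(delay)

def run_in_threads(count, delay):
    """Run `count` sleeping tasks in threads; return elapsed wall-clock seconds."""
    workers = [threading.Thread(target=blocking_task, args=(delay,))
               for _ in range(count)]
    start = time.time()
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return time.time() - start

elapsed = run_in_threads(4, 0.2)
print('4 threads sleeping 0.2 s each took %.2f s in total' % elapsed)
```

On an otherwise idle machine the total should be close to 0.2 seconds rather than 0.8, because the waits overlap. A task that spends its time executing Python bytecode instead of waiting would not see the same benefit.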
So what else is in Gevent?
Hopefully I've given you some interest in learning more about gevent as well as some of the reasoning behind its existence. Some of the other goodies you'll find in gevent include
- Functions to monkey-patch the standard library so you can use socket.socket rather than gevent.socket, for example
- Basic servers for handling socket-based connections with your own handlers
- More fine-grained control over the greenlets you spawn
- Synchronization primitives suitable for use with greenlets
- Greenlet pools
- Greenlet-local objects (like threadlocal, but with greenlets)
- Two greenlet-based WSGI servers
In future posts, I'll give more detail about how to use gevent productively. So what do you think? Is gevent something you've already got in your toolbox? Does its ability to handle concurrency interest you? Any projects already using gevent? I'd love to hear about it in the comments below!
The thread-based example makes very little sense, and greatly detracts from the point: a thread-based solution would never spawn a thread for each request (for any kind of external-facing system; werkzeug's testing server does that, but it's just for testing in order to force concurrent access). You're essentially trying to spawn 200 threads at the same time.
The test, as designed, has nothing to do with performance or even with threads. It's terrible code falling to a DOS attack.
Thanks for the comment!
I agree that the thread-based solution is one which would be horrendous for a production system. That's why (in general) thread-based servers use a thread pool to handle requests rather than spawning a separate thread per request. I actually would have used a thread pool, if there were one available in the standard library. If I *had* used a thread pool, however, the code would still fall to a DOS attack (see the Slowloris attack http://en.wikipedia.org/wiki/Slowloris for an example).
As it stands now, however, the code presented above illustrates that threads are relatively heavyweight compared to greenlets, as it's perfectly acceptable to spawn a greenlet for every incoming request and still service the requests quite well (or at least, quite well up to some threshold which is substantially higher than the threshold for threading).
Nonetheless, it's a good point that building a threaded system like the one above is a bad idea in any kind of system you want to expose to potentially hostile users.
Again, thanks for the comment!
So it turns out that, in addition to the problems you mentioned, I was using a buggy version of the Apache benchmark. For a better treatment, please see my next post on gevent and threads.
I agree that a thread-per-request is an abysmal server design and not an argument against threads per se, but, in case other Python programmers have not stumbled across this, I would like to point out that a thread-per-request is not only the behavior of Rick's sample code here, but is the default behavior of *any* server built atop Python's socket server ThreadingMixIn (!):
http://docs.python.org/library/socketserver.html#asynchronous-mixins
@Brandon: Thanks for the comment! I did not realize that pattern had been entrenched in the standard library :-(
Actually, I guess I should backpedal a bit on the idea that a thread-per-connection is always a horrible design. It can work well if you're building something like a database server or other behind-the-firewall service where a) you can plan for the number of connections and b) your connection start-up time is not critical to application performance.
I wanted to try gevent for quite some time, thx for the intro.
Even considering the things said about threads, I wanted to see how it works for me. Using your code (only arranged differently because I want to try more things later), I got not-that-horrible results for threads: not as good as greenlets but way better than calling the function directly (which is kind of strange, or not?)... am I missing something?
Code: https://gist.github.com/3169888
Results: https://gist.github.com/3169926
Your results seem reasonable. I did my initial benchmarking on Mac OS X Lion, which has a buggy version of ab. I updated ab and reran the tests, publishing a new post on gevent and threads that I'd encourage you to read.
As for your results being better than calling the function directly, this is to be expected, as calling the function directly serializes everything, and both threads and greenlets at least let each other run while waiting for I/O (or sleep(), as in the benchmark).
So to summarize: I don't think you're missing anything; I was the one missing something (a working version of ab) in my initial tests.