Just a little Python: Building Web Applications with Gevent's WSGI Server

Wednesday, August 08, 2012

Building Web Applications with Gevent's WSGI Server

Continuing my series on gevent and Python, this article discusses how to use gevent to power your Python WSGI web applications. If you're just getting started with gevent, you might want to read the previous articles in this series first.

And now that you're all caught up, let's jump into gevent's WSGI support...

WSGI refresher

For those not familiar with the Python Web Server Gateway Interface (WSGI), I'll provide a very abbreviated intro here. In WSGI, your application consists of a single function that takes environ and start_response arguments. That function will be called once for each web request received by your server. The environ argument is a Python dictionary that holds information about the request and about the server software itself. The start_response argument is a function that your application should call to set the status and headers on the HTTP response.

Once your application has called start_response, you can either return an iterable such as a list, or simply begin yielding strings to send back as the body of the response. A simple "hello world" WSGI application might look like the following:

def hello_world(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    return ['<b>Hello world!</b>\n']

If you prefer to use the yield statement instead, your application might look more like this:

def hello_world(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    yield '<b>Hello world!</b>\n'

The difference between the two approaches is that yielding strings a little at a time lets you send a large amount of data back to the client without buffering it all in memory at once, provided the WSGI server you're using supports streaming.
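To see the calling convention in isolation, a WSGI application can be driven by hand, with no server involved. The sketch below is only an illustration: greet and call_app are hypothetical names, the environ is filled in by the standard library's wsgiref.util.setup_testing_defaults, and the body is written as bytes because that is what WSGI requires on Python 3 (the listings in this article are Python 2 style, where plain strings work).

```python
from wsgiref.util import setup_testing_defaults


def greet(environ, start_response):
    """Hypothetical app: read the request path from environ, stream a reply."""
    name = environ.get('PATH_INFO', '/').strip('/') or 'world'
    start_response('200 OK', [('Content-Type', 'text/plain')])
    # Each yield hands one piece of the body to the server
    yield b'Hello, '
    yield name.encode('utf-8')
    yield b'!\n'


def call_app(app, path='/'):
    """Invoke a WSGI app roughly the way a server would, without a socket."""
    environ = {}
    setup_testing_defaults(environ)   # fills in REQUEST_METHOD, SERVER_NAME, ...
    environ['PATH_INFO'] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    # For a generator app, nothing runs until the body is iterated
    chunks = list(app(environ, start_response))
    return captured['status'], captured['headers'], chunks


status, headers, chunks = call_app(greet, '/gevent')
print(status, b''.join(chunks))
```

Because greet is a generator, start_response is only called once the server starts iterating over the body, which is why call_app reads the captured status after building the list of chunks.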

Running your WSGI app inside gevent

Gevent actually includes two separate servers capable of calling Python WSGI web applications, located in the gevent.wsgi and gevent.pywsgi modules:

gevent.pywsgi has a WSGI server implemented natively in gevent, and it supports streaming responses, HTTP pipelining, and SSL.
gevent.wsgi has a WSGI server based on the HTTP server in libevent, so it's quite a bit faster than pywsgi, but it doesn't support streaming responses, HTTP pipelining, or SSL.

If we want to take our WSGI application above and wrap it in the pywsgi server, the approach is quite similar to the "regular" TCP StreamServer discussed in the previous article, except that our WSGI application takes the place of the handler in the StreamServer case. Our entire application, then, is the following:

from gevent import pywsgi

def hello_world(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    yield '<b>Hello world!</b>\n'

server = pywsgi.WSGIServer(
    ('', 8080), hello_world)

server.serve_forever()

Now we can request a page and see that everything's working just fine.

$ curl http://localhost:8080
<b>Hello world!</b>
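Since pywsgi also supports SSL, the same application can be served over HTTPS. The following is a sketch rather than a tested deployment recipe: it assumes gevent is installed and that a private key and certificate exist at the placeholder paths server.key and server.crt, which are passed to pywsgi.WSGIServer as the keyfile and certfile keyword arguments. make_https_server is a hypothetical helper name, and the gevent import sits inside it so that the application itself can be loaded without gevent.

```python
def hello_world(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    # Bytes rather than str, as WSGI requires on Python 3
    return [b'<b>Hello world!</b>\n']


def make_https_server(app, port=8443,
                      keyfile='server.key', certfile='server.crt'):
    """Build (but do not start) a pywsgi server that speaks HTTPS.

    The key and certificate paths are placeholders; substitute your own.
    """
    from gevent import pywsgi  # assumption: gevent is installed
    return pywsgi.WSGIServer(('', port), app,
                             keyfile=keyfile, certfile=certfile)


# To run:  make_https_server(hello_world).serve_forever()
# Then:    curl -k https://localhost:8443
```

The -k flag tells curl to accept a self-signed certificate, which is what you would typically have on a development machine.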

wsgi versus pywsgi

If we run curl with the -v argument, we can see the headers sent back by the server. We can observe what the WSGI server's doing by paying attention to the lines starting with < and *:

$ curl -v http://localhost:8080
...
< HTTP/1.1 200 OK
< Content-Type: text/html
< Date: Sat, 04 Aug 2012 19:34:51 GMT
< Transfer-Encoding: chunked
< 
* Connection #0 to host localhost left intact
* Closing connection #0

There are a couple of things of interest here:

The Transfer-Encoding: chunked indicates that we can stream out data a little bit at a time, without having to buffer everything in server memory.
The next-to-last line indicates that the connection is left intact. This means that the server keeps the connection open for further requests (a persistent connection), which is what HTTP pipelining relies on.
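For readers curious about what Transfer-Encoding: chunked looks like on the wire, here is a minimal sketch of the framing defined by HTTP/1.1: each chunk is preceded by its length in hexadecimal, and a zero-length chunk ends the body. encode_chunked is a hypothetical helper written for illustration; it is not a copy of how pywsgi does this internally.

```python
def encode_chunked(chunks):
    """Frame an iterable of byte strings using HTTP/1.1 chunked coding."""
    for chunk in chunks:
        if not chunk:
            continue  # an empty chunk would be read as the end of the body
        # Chunk size in hex, CRLF, the data itself, CRLF
        yield b'%x\r\n' % len(chunk) + chunk + b'\r\n'
    # A zero-length chunk terminates the body
    yield b'0\r\n\r\n'


wire = b''.join(encode_chunked([b'Hello, ', b'world!\n']))
print(wire)
```

Because each chunk carries its own length, the server never needs to know the total size up front, which is what makes streaming possible.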

If we change our code just a bit to use the wsgi module instead (leaving everything else the same), we'll see a different set of headers:

$ curl -v http://localhost:8080
...
< HTTP/1.1 200 OK
< Content-Type: text/html
< Content-Length: 19
< Server: gevent/0.13 Python/2.7
< Connection: close
< Date: Sat, 04 Aug 2012 19:41:49 GMT
< 
* Closing connection #0

The points important to note here include:

The Content-Length: 19 header is present, and the Transfer-Encoding: chunked is missing. This is our clue that the server has buffered the entire response in memory before beginning to respond.
The wsgi server is quite forthcoming about its identity with the Server line, going so far as to say which versions of gevent and Python are being used. This is a potential security problem, as it allows an attacker to target vulnerabilities in the particular server software used.
The Connection: close header and the lack of anything indicating that the connection was left intact indicate that the server closes the connection after each response, so HTTP pipelining is not supported.
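The buffering trade-off behind the Content-Length header can be illustrated with a small piece of WSGI middleware. This is a sketch of the general idea only, not a description of how gevent.wsgi is implemented; buffer_response and streaming_app are hypothetical names, and the bodies are bytes as Python 3 requires.

```python
def buffer_response(app):
    """Wrap a WSGI app so its whole body is collected before any is sent."""
    def wrapper(environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold the status and headers until the body length is known
            captured['status'] = status
            captured['headers'] = list(headers)

        result = app(environ, capture)
        try:
            body = b''.join(result)   # the entire response now sits in memory
        finally:
            if hasattr(result, 'close'):
                result.close()

        headers = [(name, value) for name, value in captured['headers']
                   if name.lower() != 'content-length']
        headers.append(('Content-Length', str(len(body))))
        start_response(captured['status'], headers)
        return [body]
    return wrapper


def streaming_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    yield b'Hello, '
    yield b'world!\n'
```

Wrapping streaming_app this way produces a single body with an exact Content-Length, at the cost of holding the full response in memory first.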

Given the functionality differences between the servers illustrated by the headers they return, you might wonder why you'd ever use wsgi instead of pywsgi. The answer lies in the performance difference between the two servers. In my testing, wsgi could handle 3800 requests per second, while pywsgi could handle around 2400 requests per second, for a speedup of roughly 1.6. (For those interested, that's using ab -n 10000 -c 1000 http://localhost:8080/ as a micro-benchmark.)

So if you have an application where the performance is limited by the WSGI server overhead, you might want to consider using the wsgi server. In my experience, however, even fairly trivial Python WSGI applications have performance that is measured (at best) in the hundreds of requests per second. Put another way, the difference in overhead introduced by wsgi versus pywsgi is less than 160 microseconds per request, and you give up quite a bit of functionality in the process.
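The per-request figure follows directly from the two throughput numbers. The short calculation below reproduces it; the 500 requests per second used for the application is a made-up value, chosen only to stand for "hundreds of requests per second".

```python
WSGI_RPS = 3800.0     # requests per second measured for gevent.wsgi
PYWSGI_RPS = 2400.0   # requests per second measured for gevent.pywsgi

# Time spent per request, in microseconds
wsgi_us = 1e6 / WSGI_RPS
pywsgi_us = 1e6 / PYWSGI_RPS
extra_us = pywsgi_us - wsgi_us
speedup = WSGI_RPS / PYWSGI_RPS

# Hypothetical application that handles 500 requests per second overall
APP_RPS = 500.0
app_us = 1e6 / APP_RPS
share = extra_us / app_us

print('wsgi:   %.1f us per request' % wsgi_us)
print('pywsgi: %.1f us per request' % pywsgi_us)
print('extra:  %.1f us per request (speedup %.2f)' % (extra_us, speedup))
print('share of a %.0f us request: %.1f%%' % (app_us, share * 100))
```

The extra cost of pywsgi comes out at roughly 154 microseconds per request, a small fraction of the time a typical application spends on each request.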

Using Python web frameworks with gevent

In most cases, you'll be using a web framework to build your larger Python web applications, and most if not all of those frameworks provide a WSGI application that you can plug directly into gevent. Some of the relevant frameworks are below:

Django - In particular, note that projectname/wsgi.py contains a WSGI application that you can use with gevent's WSGIServer.
Pyramid - The Pyramid documentation uses make_server from the standard library's wsgiref module, but you can just as easily use the WSGIServer from gevent.
Flask - The Flask documentation includes gevent instructions.
TurboGears - Since TurboGears uses PasteDeploy, you can use paste-gevent to wrap your application.

In most cases, you don't need to make any significant changes to your web application to make it work correctly with gevent, particularly if you use the gevent monkey-patching module.
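As a rough sketch of what plugging a framework into gevent can look like, the helper below resolves a "module:attribute" string to a WSGI callable and serves it with pywsgi after monkey-patching. load_application and main are hypothetical names, gevent is assumed to be installed, and the default target is demo_app from the standard library's wsgiref.simple_server, used here purely as a stand-in for a framework's application (for a Django project it would be something like projectname.wsgi:application).

```python
import importlib


def load_application(spec):
    """Resolve a 'package.module:attribute' string to a WSGI callable."""
    module_name, _, attr = spec.partition(':')
    module = importlib.import_module(module_name)
    # Fall back to the conventional name when no attribute is given
    return getattr(module, attr or 'application')


def main(spec='wsgiref.simple_server:demo_app', port=8080):
    # Assumption: gevent is installed. Patch the standard library before the
    # framework's modules are imported, so they use the cooperative versions.
    from gevent import monkey
    monkey.patch_all()
    from gevent import pywsgi

    application = load_application(spec)
    pywsgi.WSGIServer(('', port), application).serve_forever()


# To run: main() or main('projectname.wsgi:application')
```

Calling monkey.patch_all() before the framework is imported matters because code that has already bound the blocking socket or threading primitives will not see the patched versions.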

Conclusion

Using gevent as a Python WSGI server has one great side-effect: you can spin off greenlets to do background processing whenever you want. Of course, it's also nice to be able to handle thousands of simultaneous connections without seriously taxing your server. One place where this is useful is in web socket or long-polling applications where you need to support lots of simultaneous connections.

There are a couple of modules, gevent-websocket and gevent-socketio, that make this type of application work well in a gevent WSGI wrapper. I've mentioned them previously, but I'll be going into some more depth in upcoming articles, so watch out for them.
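The background-processing idea mentioned above can be sketched as follows. The names slow_job, make_app, and completed are hypothetical, and gevent is assumed to be installed when main is called. The spawn function is passed in as a parameter so the application logic can be exercised without gevent; under the gevent server it would be gevent.spawn, which starts the job in a new greenlet and returns immediately.

```python
completed = []   # stand-in for wherever the background results end up


def slow_job(path):
    """Placeholder for slow work such as sending mail or warming a cache."""
    completed.append(path)


def make_app(spawn):
    """Build a WSGI app that hands slow work to `spawn` and replies at once."""
    def app(environ, start_response):
        # Kick off the job; with gevent.spawn this does not block the response
        spawn(slow_job, environ.get('PATH_INFO', '/'))
        start_response('202 Accepted', [('Content-Type', 'text/plain')])
        return [b'queued\n']
    return app


def main():
    import gevent                 # assumption: gevent is installed
    from gevent import pywsgi
    pywsgi.WSGIServer(('', 8080), make_app(gevent.spawn)).serve_forever()
```

The client gets its response right away while the greenlet carries on in the same process, with no separate task queue needed for light background work.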

So what do you think? Do you already use the gevent server to host your Python WSGI applications? Is it something that you'd consider doing? Anyone using gevent for long-lived web clients and not using websockets or socketio? I'd love to hear about it in the comments below!

14 comments:

Greg, 4:32 PM
I'm new to all this. How is gevent's WSGI server related to Gunicorn? Does one use the other? Are they 'competing' products?

Unknown, 5:02 AM
Cool!
now I know how to write HTTP streaming server for testing purposes
thanks a lot for post !!!

Rakesh Patil, 12:53 AM
how to serve the static pages .. ?? is there a method in which gevent can be run as standalone server for both wsgi and its static contents.

GeekBrit, 11:29 AM
This is a great write up, thank you. I think that version changes are conspiring against me when trying to use these techniques - I'm getting this error:

AttributeError: GreenSocket has no such option: _GREENSOCKET__IN_SEND_MULTIPART

Google isn't much help - I'm getting one result, which is a bug report for locust. The maintainer simply removed gevent_zmq from the build.

I tried using zmq directly, but we get a lockup in gevent.sleep in zmq_producer - gevent.sleep never returns.

Any ideas how best to proceed?

sailoratbay, 10:48 AM
The exception is raised by pyzmq 13.x, see https://github.com/locustio/locust/issues/58.
Removing pyzmq and reinstalling it explicitly (download https://pypi.python.org/packages/source/p/pyzmq/pyzmq-2.2.0.1.zip#md5=31d4100d62e352e5e19824ded45aaac9 and run setup.py) will fix the problem...
