Continuing on in my series on gevent and Python, this article covers what you need to do when you want to use the Python standard library with gevent, and shows how gevent can monkey-patch the standard library to make it compatible. If you're just getting started with gevent, you might want to read the previous articles in this series first.
And now that you're all caught up, let's get started with gevent...
Blocking is bad
You may have read that introduction and wondered to yourself why you can't use the Python standard library as-is for your gevent programs. The answer lies in the way gevent accomplishes its cooperative multithreading. At the core of gevent is an event loop. The important thing to realize about the event loop is that it is where gevent decides which greenlet will run next.
The way the event loop actually gets control of your program in gevent is that you call one of the gevent functions that implicitly enter the loop. For instance, conceptually, gevent.sleep works like this:
    def sleep(seconds):
        ev = schedule_timeout_event(seconds)
        schedule_greenlet(current_greenlet(), ready=ev)
        switch_to_event_loop()
So rather than actually blocking the current thread, as time.sleep would do, gevent.sleep registers an event that will fire to tell the event loop that this greenlet is ready to run, and then switches control to the event loop.
The problem, then, with using a standard library function that doesn't know about the event loop is that it will simply block without returning control to the event loop. If the event loop doesn't run, no other greenlets can run. So if you call time.sleep in your gevent program, you'll simply freeze everything until the sleep returns.
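To make the difference concrete, here is a minimal sketch of a cooperative scheduler built on plain generators. It is a toy, not gevent's implementation: the ToyLoop class, the green_worker and blocking_worker functions, and the convention that a task yields the number of seconds it wants to sleep are all made up for this illustration.

```python
import heapq
import time


class ToyLoop:
    """A toy event loop for generator-based tasks (not how gevent is built)."""

    def __init__(self):
        self._ready = []  # heap of (wake_time, sequence_number, task)
        self._seq = 0     # tie-breaker so tasks themselves are never compared

    def spawn(self, task, delay=0.0):
        heapq.heappush(self._ready, (time.monotonic() + delay, self._seq, task))
        self._seq += 1

    def run(self):
        while self._ready:
            wake_time, _, task = heapq.heappop(self._ready)
            pause = wake_time - time.monotonic()
            if pause > 0:
                time.sleep(pause)  # nothing is runnable yet, so the loop may wait
            try:
                delay = next(task)  # run the task until it yields control
            except StopIteration:
                continue            # task finished
            self.spawn(task, delay)  # wake it again after the requested delay


def green_worker(name, log):
    log.append(name + " start")
    yield 0.01  # "green" sleep: ask the loop for a wake-up and hand control back
    log.append(name + " end")


def blocking_worker(name, log):
    log.append(name + " start")
    time.sleep(0.01)  # blocks the thread; the loop cannot run anything else
    log.append(name + " end")
    yield 0.0  # only now does the loop get control back


def run_pair(worker):
    log = []
    loop = ToyLoop()
    loop.spawn(worker("a", log))
    loop.spawn(worker("b", log))
    loop.run()
    return log


green_log = run_pair(green_worker)
blocking_log = run_pair(blocking_worker)
print(green_log)     # ['a start', 'b start', 'a end', 'b end']
print(blocking_log)  # ['a start', 'a end', 'b start', 'b end']
```

With the cooperative workers, the second task starts while the first is still "sleeping". With the blocking workers, the loop never gets a chance to run the second task until the first has finished.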
Monkey-patching to the rescue
Gevent provides several "green" APIs that follow the pattern above, returning control to the event loop rather than blocking. Although you can technically build whatever you want out of these, many of the higher-level APIs in the standard library have no equivalent in gevent. For instance, using any of the following standard library modules as-is can completely freeze a gevent-based program:
urllib or urllib2
httplib
ftplib
poplib
imaplib
nntplib
smtplib
telnetlib
SocketServer
BaseHTTPServer
CGIHTTPServer
xmlrpclib
SimpleXMLRPCServer
Additionally, of course, if you use the standard library versions of socket or ssl rather than those included in gevent, you can end up with a globally-blocked program. Obviously the Python standard library provides a wealth of functionality that it would be nice to have available in a gevent-based program without giving up the cooperative concurrency in gevent. For this purpose, gevent provides the gevent.monkey module.
What gevent.monkey does is replace the basic blocking operations in the standard library with "greened" versions. For example, if you call gevent.monkey.patch_socket(), gevent will replace various functions and classes in the standard library socket module with the gevent versions.
By patching the foundational modules like socket, ssl, and even thread, other modules that build on their functionality like urllib or xmlrpclib automatically become green. If you want to make sure you get them all, there's a nice function provided: gevent.monkey.patch_all(), which will patch the following modules in an attempt to "green" the whole standard library:
socket
ssl
os
time
select
thread and threading
In order to make sure that the monkey-patching works, you need to make sure you do it before any of the higher-level modules such as urllib are imported. The easiest way to do this is to start your top-level script with the following code:
import gevent.monkey; gevent.monkey.patch_all()
In most cases, this will "just work" for your program.
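One reason the ordering matters can be sketched without gevent at all. In the illustration below, lowlevel is a throwaway stand-in for a module such as socket, and connect, green_connect, early_connect, and late_lookup are hypothetical names. A reference copied out of the module before the patch keeps pointing at the original, blocking function.

```python
import types

# A throwaway stand-in for a low-level module such as socket (hypothetical).
lowlevel = types.ModuleType("lowlevel")
lowlevel.connect = lambda: "blocking"


def green_connect():
    return "green"


# Bound BEFORE patching, the way `from lowlevel import connect` would bind it:
# the current function object is copied into a new name.
early_connect = lowlevel.connect


def late_lookup():
    # The `import lowlevel; lowlevel.connect()` style: resolved at call time.
    return lowlevel.connect()


lowlevel.connect = green_connect  # the "monkey-patch"

print(early_connect())  # blocking
print(late_lookup())    # green
```

Any module that copied a reference at import time is in the position of early_connect here, which is why the patch should run before those modules are imported.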
Conclusion
In the Python world, monkey-patching generally has a bad name, as it's seen as an ugly hack to modify loaded code. And it is, in fact, an ugly hack to modify loaded code, but sometimes (as in monkey-patching the standard library), it's the expedient thing to do.
So what do you think? Is the use of monkey-patching in gevent a reasonable compromise? If not, how would you build a system that requires some of the higher-level networking functionality in the standard library like xmlrpclib? I'd love to hear about it in the comments below!