Continuing my series on gevent and Python, this article gets into practical details, showing you how to install gevent and get started with basic greenlet operations. If you're just getting started with gevent, you might want to read the previous articles in this series first.
And now that you're all caught up, let's get started with gevent...
Installing gevent
The first step to working with gevent is installing it. Luckily, if you're familiar with pip, it's a fairly straightforward process. Note, however, that gevent and its dependencies include C extension modules, so you'll need to have a C compiler available for the install to work. I did my installation using a virtual environment:
    $ virtualenv .venv
    New python executable in .venv/bin/python
    Installing setuptools............done.
    Installing pip...............done.
    $ source .venv/bin/activate
    (.venv) $ pip install gevent
    ... lots of install messages ...
Now that you've got gevent installed, we'll move along to one of the most basic things you can do in gevent: creating greenlets.
Creating greenlets
Greenlets, sometimes referred to as "green threads," are a lightweight structure that allows you to do some cooperative multithreading in Python without the system overhead of real threads (like the thread or threading module would use). The main thing to keep in mind when dealing with greenlets is that a greenlet will never yield to another greenlet unless it calls some function in gevent that yields. (Real threads can be interrupted at somewhat arbitrary times by the operating system, causing a context switch.)
To create greenlets, you can use the gevent.spawn_* functions. The simplest is gevent.spawn:
    >>> import gevent
    >>> def simple_greenlet(*args, **kwargs):
    ...     print 'inside greenlet with', args, kwargs
    ...
    >>> gevent.spawn(simple_greenlet, 1, 2, 3, foo=4)
    <Greenlet at 0x10149acd0: simple_greenlet(1, 2, 3, foo=4)>
    >>> gevent.sleep(1)
    inside greenlet with (1, 2, 3) {'foo': 4}
Note in particular how the greenlet didn't do anything until we called sleep(). sleep() is one of the functions in gevent which will yield to other greenlets. If you want to yield to other greenlets but don't care to wait a second if there's no one ready to run, you can call gevent.sleep(0).
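To make the yielding behavior a bit more concrete, here is a minimal sketch, assuming Python 3 and a reasonably recent gevent release. The worker function and the events list are hypothetical names invented for this illustration; only spawn, sleep, and joinall come from gevent.

```python
import gevent

events = []  # records (name, step) each time a worker gets to run

def worker(name, steps):
    for step in range(steps):
        events.append((name, step))
        # Yield to any other greenlet that is ready to run.
        gevent.sleep(0)

a = gevent.spawn(worker, 'a', 3)
b = gevent.spawn(worker, 'b', 3)

# spawn() only schedules the greenlets; nothing has run yet.
before = list(events)

# Hand control to the hub until both workers have finished.
gevent.joinall([a, b])
print(before)
print(events)
```

Because each worker calls gevent.sleep(0) after every step, the two greenlets take turns instead of one running to completion before the other starts.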
We can actually set up several greenlets to run concurrently and then sleep
while they run:
    >>> greenlets = [ gevent.spawn(simple_greenlet, x) for x in range(10) ]
    >>> gevent.sleep(0)
    inside greenlet with (0,) {}
    inside greenlet with (1,) {}
    ...
    inside greenlet with (8,) {}
    inside greenlet with (9,) {}
If you're interested in waiting for the greenlets to complete, you can do so by using the gevent.joinall function. joinall can also take a timeout parameter that makes it stop waiting if the greenlets haven't all completed within the given time. In the basic case, you just pass a list of Greenlet objects to joinall:
    >>> greenlets = [ gevent.spawn(simple_greenlet, x) for x in range(10) ]
    >>> gevent.joinall(greenlets)
    inside greenlet with (0,) {}
    inside greenlet with (1,) {}
    ...
    inside greenlet with (8,) {}
    inside greenlet with (9,) {}
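The timeout case is easier to see with one greenlet that finishes quickly and one that doesn't. This is only a sketch, assuming Python 3 and a recent gevent; slow_greenlet is a hypothetical helper, and the delays are arbitrary.

```python
import gevent

def slow_greenlet(delay):
    # gevent.sleep yields to other greenlets while "working".
    gevent.sleep(delay)
    return delay

fast = gevent.spawn(slow_greenlet, 0.01)
slow = gevent.spawn(slow_greenlet, 5)

# Wait at most half a second for both greenlets.
gevent.joinall([fast, slow], timeout=0.5)

# ready() tells us which greenlets actually finished in time.
fast_done, slow_done = fast.ready(), slow.ready()
print(fast_done, slow_done)

# Clean up the greenlet that is still sleeping.
slow.kill()
```

Note that joinall giving up does not stop the unfinished greenlet; it keeps running until it completes or is killed.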
By default, the "parent" greenlet that created a "child" greenlet won't receive any kind of feedback about the state of that child. If you'd like some feedback, there are other spawn_* functions you can use:
spawn_later(secs, function, *args, **kwargs) - This is the same as spawn, except it waits the specified number of seconds before starting the child greenlet.
spawn_link(function, *args, **kwargs) - When the child greenlet dies (either due to an exception or due to successful completion), a gevent.greenlet.LinkedExited exception subclass will be raised in the parent: LinkedCompleted on successful completion, LinkedFailed on an unhandled exception in the child, or LinkedKilled if the child was killed by another greenlet.
spawn_link_exception(function, *args, **kwargs) - Just like linking the child, but the parent is only notified when the child dies due to an unhandled exception.
spawn_link_value(function, *args, **kwargs) - Just like linking the child, but the parent is only notified when the child completes successfully.
Normally, I use spawn_link_exception to make sure that the greenlet doesn't die unexpectedly without notifying its parent.
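As far as I know, gevent releases from 1.0 onward no longer ship the spawn_link_* helpers, so if they're missing in your install, a similar "tell me when the child fails" effect can be sketched with Greenlet.link_exception and a callback. This assumes Python 3 and a recent gevent; flaky_greenlet, on_failure, and failures are hypothetical names for this example, and it is not a drop-in replacement for the exception-raising behavior described above.

```python
import gevent

failures = []  # filled in by the callback below

def flaky_greenlet():
    raise ValueError('something went wrong')

def on_failure(greenlet):
    # Called with the failed greenlet as its only argument.
    failures.append(greenlet.exception)

child = gevent.spawn(flaky_greenlet)
child.link_exception(on_failure)

child.join()      # let the child run (and fail)
gevent.sleep(0)   # give the hub a chance to run any pending callbacks
print(failures)
```

gevent also prints the child's traceback to stderr when it dies with an unhandled exception, so expect some noise when running this.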
Greenlet objects
You probably noticed in the code above that the spawn_* functions return Greenlet objects. You can also construct the objects manually. The big difference between building a Greenlet this way versus with the spawn_* functions is that the Greenlet doesn't start executing automatically:
    >>> gl = gevent.Greenlet(simple_greenlet, 'In a Greenlet')
    >>> gevent.sleep(1)
    >>> gl.start()
    >>> gevent.sleep(1)
    inside greenlet with ('In a Greenlet',) {}
There are several useful attributes and methods on these objects that you can use to interact with a running greenlet:
value - If a greenlet completes successfully and returns a value, it will be stored in this instance variable.
ready() - Returns True if the greenlet has finished execution (is the result "ready"?), either successfully or due to an error.
successful() - Returns True only if the greenlet completed successfully.
start() - Start the greenlet.
run() - Run the greenlet (like gl.start(); gevent.sleep(0)).
start_later(secs) - Schedule the greenlet to start later.
join(timeout=None) - Wait for the greenlet to complete or for the timeout to expire.
get(block=True, timeout=None) - Returns the return value of the greenlet or, if the greenlet raised an unhandled exception, re-raises it here. If block==False and the greenlet is still running, raises gevent.Timeout. Otherwise, waits until the greenlet exits or the timeout expires (in which case gevent.Timeout is raised).
kill(exception=GreenletExit, block=False, timeout=None) - Raises an exception in the context of the target greenlet (typically called from another greenlet). By default, this is GreenletExit. With block==True, this function will wait for the greenlet to die or for the timeout to expire.
link(receiver=None) (also link_exception and link_value) - With the default value of None, these raise exceptions in the linking greenlet similar to the spawn_link_* functions. If you provide a Greenlet as the receiver, then the exception will be raised in that greenlet's context. If you provide a function as the receiver, it will be called with the linked greenlet as its sole parameter.
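Here is a short sketch of how a few of these fit together, assuming Python 3 and a recent gevent. The add and fail functions are hypothetical stand-ins for real work.

```python
import gevent

def add(a, b):
    return a + b

def fail():
    raise RuntimeError('boom')

good = gevent.spawn(add, 2, 3)
bad = gevent.spawn(fail)
gevent.joinall([good, bad])

# Both greenlets are finished, but only one of them succeeded.
print(good.ready(), good.successful(), good.value)
print(bad.ready(), bad.successful(), bad.value)

# get() returns the value of a successful greenlet...
result = good.get()

# ...and re-raises the exception of a failed one.
caught = None
try:
    bad.get()
except RuntimeError as exc:
    caught = exc
print(result, repr(caught))
```

The difference between value and get() matters mostly for failures: value is simply None for a failed greenlet, while get() surfaces the original exception.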
Limiting concurrency
Although greenlets are quite low-overhead, there may be some cases in which you wish to limit the number of greenlets created. gevent includes a Pool class (in the gevent.pool module) that has the same spawn_* API that we saw earlier, as well as a few extra methods:
wait_available() waits until there is at least one idle greenlet in the pool.
full() returns True if all the greenlets in the pool are busy.
free_count() returns the number of greenlets available to do work.
Pool is actually a subclass of Group, which has various methods for adding and removing greenlets as well as some map_* and apply_* methods, but I won't get into them here.
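A minimal sketch of a pool in action, assuming Python 3 and a recent gevent. The job function and the running and peak counters are hypothetical; they exist only to show that no more than two jobs are ever active at once.

```python
import gevent
from gevent.pool import Pool

pool = Pool(2)   # at most two greenlets may run at once
running = [0]    # number of jobs currently active
peak = [0]       # highest number of jobs seen active together

def job(n):
    running[0] += 1
    peak[0] = max(peak[0], running[0])
    gevent.sleep(0.01)   # simulate some I/O-bound work
    running[0] -= 1
    return n * n

# pool.spawn blocks (yielding to the running jobs) while the pool is full.
greenlets = [pool.spawn(job, n) for n in range(6)]
pool.join()      # wait for every job in the pool to finish

results = [g.value for g in greenlets]
print(results, peak[0], pool.free_count())
```

The part that matters here is that spawning on a full pool waits for a free slot rather than failing, so the loop that creates the greenlets is what ends up throttled.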
Conclusion
There's a lot more to gevent that I'll cover in future articles, particularly in the realm of building network servers using gevent, but hopefully this article gives you a feel for the basic concurrency abstractions underlying gevent. So what do you think? Is gevent already part of your Python toolkit? Interested in trying it out? I'd love to hear what you think in the comments below!