Over the past few weeks I've been working on a service in Python that I'm calling, in the tradition of naming projects after characters in Flash Gordon, Zarkov. So what exactly is Zarkov? Well, Zarkov is many things (and may grow to more):
- Zarkov is an event logger
- Zarkov is a lightweight map-reduce framework
- Zarkov is an aggregation service
- Zarkov is a webservice
In the next few posts, I'll be going over each of the components of Zarkov and how they work together. Today, I'll focus on Zarkov as an event logger.
Technologies
So there are just a few prerequisite technologies you should know something about before working with Zarkov. I'll give a brief overview of these here.
- ZeroMQ: ZeroMQ is used for Zarkov's wire and buffering protocol all over the place. Generally you'll use PUSH sockets to send data and events to Zarkov, and REQ sockets to talk to the Zarkov map-reduce router.
- MongoDB: Zarkov uses MongoDB to store events and aggregates, so you should have a MongoDB server handy if you'll be doing anything with Zarkov. We also use Ming, an object-document mapper developed at SourceForge, to do most of our interfacing with MongoDB.
- Gevent: Internally, Zarkov uses gevent's "green threads" to keep things nice and lightweight. If you're just using Zarkov, you probably don't need to know a lot about gevent, but if you start hacking on the source code, it's all over the place (as well as special ZeroMQ and Ming adapters for gevent). So it's probably good to have at least a passing familiarity.
Installation
To install Zarkov, you'll need to be able to install ZeroMQ and gevent, which probably means installing the ZeroMQ and libevent development libraries. On Ubuntu, I had to build ZeroMQ 2.1 from source (which isn't too tough):
$ wget http://download.zeromq.org/zeromq-2.1.7.tar.gz
$ tar xzf zeromq-2.1.7.tar.gz
$ cd zeromq-2.1.7
$ ./configure --prefix=/usr/local && make
$ sudo make install
$ # if you're on ubuntu, this next line will work
$ sudo apt-get install libevent-dev
$ # otherwise you need to
$ wget http://monkey.org/~provos/libevent-1.4.13-stable.tar.gz
$ tar xzf libevent-1.4.13-stable.tar.gz
$ cd libevent-1.4.13-stable
$ ./configure --prefix=/usr/local && make
$ sudo make install
Now you should be able to do a regular pip install to get everything else:
$ virtualenv zarkov
$ source zarkov/bin/activate
(zarkov) $ pip install Zarkov
Next, you should customize your development.yaml file. Here's a convenient example we use in testing:
bson_bind_address: tcp://0.0.0.0:6543
json_bind_address: tcp://0.0.0.0:6544
web_port: 8081
backdoor: 127.0.0.1:6545
mongo_uri: mongodb://localhost:27017
mongo_database: zarkov
verbose: true
incremental: 0
zmr:
    req_uri: tcp://127.0.0.1:5555
    req_bind: tcp://0.0.0.0:5555
    worker_uri: tcp://0.0.0.0
    local_workers: 2
    job_root: /tmp/zmr
    map_page_size: 250000000
    map_job_size: 10000
    outstanding_maps: 16
    outstanding_reduces: 16
    request_greenlets: 16
    compress: 0 # compression level
    src_port: 0 # choose a random port
    sink_port: 0 # choose a random port
    processes_per_worker: null # default == # of cpus
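Note that the bind addresses in this config listen on all interfaces (0.0.0.0), while a client has to connect to a concrete host. Here's a small sketch, using only the standard library, of turning a bind address from the config into something a client could connect to. The helper name `connect_address` and its `host` parameter are made up for illustration; they are not part of Zarkov.

```python
from urllib.parse import urlsplit


def connect_address(bind_address, host="localhost"):
    """Turn a bind address like tcp://0.0.0.0:6543 into a connect address.

    Hypothetical helper, not part of Zarkov.
    """
    parts = urlsplit(bind_address)
    if parts.port is None:
        raise ValueError("bind address has no port: %r" % bind_address)
    # 0.0.0.0 (and ZeroMQ's "*") mean "all interfaces", which only makes
    # sense when binding, so swap in the host the client should reach.
    target = host if parts.hostname in ("0.0.0.0", "*") else parts.hostname
    return "%s://%s:%d" % (parts.scheme, target, parts.port)


print(connect_address("tcp://0.0.0.0:6543"))
```

An address that already names a specific host, such as the `req_uri` value above, passes through unchanged.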
Zarkov defines a format for an event stream which tries to be fairly generic (though our main use-case is logging SourceForge events for later aggregation). A Zarkov event is a BSON object containing the following data:
- timestamp (datetime) : when did the event occur?
- type (str): what is the type of event?
- context (object): in what context did the event occur? On SourceForge, this includes the project context, the user logged in, the IP address, etc.
- extra (whatever): this is purely up to the event generator. It might be a string, integer, object, array, whatever. (It should be supported by BSON, of course.)
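To make the four fields concrete, here's a rough sketch of building such an event as a plain Python dict. The function name `make_event` and its defaults are assumptions for illustration, not Zarkov's API, and the JSON dump at the end is only a way to eyeball the structure (the wire format described in this post is BSON).

```python
import datetime
import json


def make_event(event_type, context=None, extra=None, timestamp=None):
    """Build a dict with the four event fields listed above.

    Hypothetical helper, not part of Zarkov.
    """
    if timestamp is None:
        # Assumption: default to "now" in UTC when the caller gives no time.
        timestamp = datetime.datetime.now(datetime.timezone.utc)
    return {
        "timestamp": timestamp,
        "type": event_type,
        "context": context if context is not None else {},
        "extra": extra,
    }


event = make_event("nop", {"project": "demo", "ip": "127.0.0.1"})
# default=str turns the datetime into text so json can print it.
print(json.dumps(event, default=str, indent=2))
```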
The Zarkov events are stored in a MongoDB database (again with the Flash Gordon references). Assuming you've already installed Zarkov, to run the server you'd execute the following command:
(zarkov) $ zcmd -y development.yaml serve
Now to test, you can use the file zsend.py (included with Zarkov) to send a message to the server:
(zarkov) $ echo '{"type":"nop"}' | zsend.py tcp://localhost:6543
To confirm it got there, you can use the 'shell' subcommand from zcmd:
(zarkov) $ zcmd -y development.yaml shell
Then, in the shell you're given, execute the following commands:
In [1]: ZM.event.m.find().all()
Out[1]:
[{'_id': ObjectId('4e2723eeb240217416000001'),
  'aggregates': [],
  'context': {},
  'jobs': [],
  'timestamp': datetime.datetime(2011, 7, 20, 18, 52, 30, 272000),
  'type': u'nop'}]
(Your _id value will probably be different.) To really use Zarkov as an event logger, you'll probably want to send the ZeroMQ messages yourself. Zarkov includes a client to do just that. From the zcmd shell:
In [1]: from zarkov import client
In [2]: conn = client.ZarkovClient('tcp://localhost:6543')
In [3]: conn.event('nop', {'sample_context_key': 'sample_context_val'})
In [4]: ZM.event.m.find().all()
Out[4]:
[{'_id': ObjectId('4e2723eeb240217416000001'),
  'aggregates': [],
  'context': {},
  'jobs': [],
  'timestamp': datetime.datetime(2011, 7, 20, 18, 52, 30, 272000),
  'type': u'nop'},
 {'_id': ObjectId('4e2725a8b240217483000001'),
  'aggregates': [],
  'context': {u'sample_context_key': u'sample_context_val'},
  'extra': None,
  'jobs': [],
  'timestamp': datetime.datetime(2011, 7, 20, 18, 59, 52, 756000),
  'type': u'nop'}]
If you want to customize things further, the ZarkovClient code is actually quite short:
'''Python client for zarkov.'''
import zmq

import bson

class ZarkovClient(object):

    def __init__(self, addr):
        context = zmq.Context.instance()
        self._sock = context.socket(zmq.PUSH)
        self._sock.connect(addr)

    def event(self, type, context, extra=None):
        obj = dict(
            type=type, context=context, extra=extra)
        self._sock.send(bson.BSON.encode(obj))

    def event_noval(self, type, context, extra=None):
        from zarkov import model
        obj = model.event.make(dict(
            type=type, context=context, extra=extra))
        obj['$command'] = 'event_noval'
        self._sock.send(bson.BSON.encode(obj))

    def _command(self, cmd, **kw):
        d = dict(kw)
        d['$command'] = cmd
        self._sock.send(bson.BSON.encode(d))
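As one example of the kind of customization you might do, here's a sketch of a thin wrapper that merges a fixed base context (project, IP address, and so on) into every event before handing it to a client. Both `ContextClient` and `RecordingClient` are hypothetical names invented for this sketch and are not part of Zarkov; `RecordingClient` just stands in for the real client so the example runs without ZeroMQ or a server.

```python
class ContextClient(object):
    """Wrap any client that has event(type, context, extra=None).

    Hypothetical helper, not part of Zarkov: it merges a fixed base
    context into the context of every event before delegating.
    """

    def __init__(self, client, base_context):
        self._client = client
        self._base = dict(base_context)

    def event(self, type, context=None, extra=None):
        merged = dict(self._base)
        merged.update(context or {})  # per-event keys win over base keys
        self._client.event(type, merged, extra)
        return merged


class RecordingClient(object):
    """Stand-in for the real client; it only records what it was given."""

    def __init__(self):
        self.sent = []

    def event(self, type, context, extra=None):
        self.sent.append(dict(type=type, context=context, extra=extra))


recorder = RecordingClient()
wrapped = ContextClient(recorder, {"project": "demo", "ip": "127.0.0.1"})
wrapped.event("login", {"user": "alice"})
print(recorder.sent)
```

With Zarkov installed, you would presumably pass a `ZarkovClient` instance in place of the recorder, since it exposes the same `event(type, context, extra=None)` method.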
"... a service in Python that I'm calling, in the tradition of naming projects after characters in Flash Gordon, Zarkov."
I was not aware of this tradition. Can you elaborate?
It's just my own tradition. Since Mongo is the planet ruled by the emperor Ming the Merciless in Flash Gordon, I named my MongoDB library Ming. Zarkov is the scientist that helps Flash and Dale get to Mongo, so I figured I'd name an event logger that puts stuff into Mongo "Zarkov".
Install Zarkov's dependencies with Homebrew on a Mac:
$ brew install zeromq
$ brew install libevent
After that you'll need to create a journal directory before starting up zarkov:
(zarkov) $ mkdir journal
(zarkov) $ zcmd -y development.yaml serve
A helpful reader pointed out that you can get into trouble installing ZeroMQ in /usr/local and trying to just 'pip install Zarkov'. Here's the solution I was able to come up with to get this to work. Just before the 'pip install Zarkov' command, you'll need to install pyzmq using the following command line:
pip install pyzmq --install-option='--zmq=/usr/local'
Hopefully that gets you around any installation hiccups you run into.