Ming is a Python toolkit providing schema enforcement, an object/document mapper, an in-memory database, and various other goodies. It was developed at SourceForge during our rewrite of the site from a PHP/Postgres stack to a Python/MongoDB one.
Why Ming?
If you've come to MongoDB from the world of relational databases, you have probably been struck by just how easy everything is: no big object/relational mapper needed, no new query language to learn (well, maybe a little, but we'll gloss over that for now), everything is just Python dictionaries, and it's so, so fast! While this is all true to some extent, one of the big things you give up with MongoDB is structure.
MongoDB is sometimes referred to as a schema-free database. (This is not technically true; I find it more useful to think of MongoDB as having dynamically typed documents. The collection doesn't tell you anything about the type of documents it contains, but each individual document can be inspected.) While this can be nice, as it's easy to evolve your schema quickly in development, it's also easy to get yourself in trouble the first time your application tries to query by a field that only exists in some of your documents.
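To make that pitfall concrete, here is a tiny pymongo sketch (the collection and field names are invented for illustration, and it uses the older Connection API to match the examples later in this post):

from pymongo import Connection

db = Connection().test
db.posts.insert({'title': 'first post'})                 # no 'views' field
db.posts.insert({'title': 'second post', 'views': 10})

# Only the second document matches; the first is silently ignored.
print(db.posts.find({'views': {'$gte': 0}}).count())     # -> 1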
The fact of the matter is that even if the database cares nothing about your schema, your application does, and if you play too fast and loose with document structure, it will come back to haunt you in the end. At SourceForge, we created Ming (as in "...the Merciless", the villain who ruled the planet Mongo in Flash Gordon) to deal with precisely this problem. We wanted a (thin) layer on top of PyMongo that would do a couple of things for us:
- Make sure that we don't put malformed data into the database
- Try to 'fix' malformed data coming back from the database
Ming's Architecture
Ming's architecture is based on the excellent SQL toolkit SQLAlchemy. While Ming is much younger than SQLAlchemy and shares none of its code, it takes its design inspiration from that project.
Ming actually consists of a number of components, including:
- The schema enforcement layer - This is 'basic' Ming, providing validation and conversion of documents on their way in and out of MongoDB. There are actually two APIs at this layer, the imperative syntax and a more declarative syntax.
- The object/document mapper - The ODM layer extends the schema enforcement layer by providing a unit of work, an identity map, and pseudo-relational concepts (one-to-many joins, for instance).
- MongoDB-in-Memory (MIM) - This is a layer designed to be a drop-in replacement for the native pymongo driver, used for testing your application without needing access to a MongoDB server.
Let's take a look at each of these components in turn...
Ming Schema Enforcement
A Ming schema is fairly straightforward. Below is an example containing the schema for a blog post in both the imperative and declarative syntaxes:
from datetime import datetime

from ming import collection, Field, Session
from ming import schema as S

session = Session()  # ming abstraction for the database

# Set up the User schema ahead of time
User = dict(username=str, display_name=str)

# "Imperative" style
BlogPost = collection(
    'blog.posts', session,
    Field('_id', S.ObjectId),
    Field('posted', datetime, if_missing=datetime.utcnow),
    Field('title', str),
    Field('author', User),
    Field('text', str),
    Field('comments', [
        dict(author=User,
             posted=S.DateTime(if_missing=datetime.utcnow),
             text=str)
    ]))

# "Declarative" style
from ming.declarative import Document

class BlogPost(Document):

    class __mongometa__:
        session = session
        name = 'blog.posts'
        indexes = ['author.username', 'comments.author.username']

    _id = Field(S.ObjectId)
    title = Field(str)
    posted = Field(datetime, if_missing=datetime.utcnow)
    author = Field(User)
    text = Field(str)
    comments = Field([
        dict(author=User, posted=datetime, text=str)
    ])
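With either style in place, Ming refuses to let malformed documents into MongoDB. Here is a minimal sketch of what that looks like (it assumes the session has been bound to a datastore, as shown in the next snippet, and the exact error text may differ):

from ming import schema as S

# 'title' is declared as str above, so a non-string value should be
# rejected by the schema layer before it reaches the database.
bad_post = BlogPost(dict(title=42, text='oops'))
try:
    bad_post.m.insert()
except S.Invalid as err:
    print('rejected:', err)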
Once you have your schema set up, you can use it to perform all the same operations you can do in pymongo, using the manager object attached to the m attribute:
# Bind the session to the database
from ming.datastore import DataStore

session.bind = DataStore(
    'mongodb://localhost:27017',
    database='test')

# Queries
BlogPost.m.find(...)                 # equiv. to db.blog.posts.find(...)

# Inserts
post0 = BlogPost(dict(... fields here ...))
post0.m.insert()

# Updates using save()
post1 = BlogPost.m.find({'author.username': 'rick446'}).first()
post1.author.username = 'rick447'
post1.m.save()

# Updates using update_partial()
BlogPost.m.update_partial(
    {'_id': ...},
    {'$push': {'comments': {... comment data...}}})

# Deletes
post1.m.delete()                     # single document
BlogPost.m.remove({...query...})     # delete by query
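The schema layer also works in the other direction, per the goals above: documents coming back from MongoDB are validated and lightly 'fixed', so a field declared with if_missing (like posted) is filled in with its default even for documents that lack it. Roughly (the direct pymongo insert here is just a way to sneak a 'bare' document past Ming; the document contents are made up):

from pymongo import Connection

# Sneak a document into the collection behind Ming's back,
# deliberately leaving out the 'posted' field.
Connection().test['blog.posts'].insert(
    {'title': 'legacy post', 'text': 'no timestamp here'})

# Loading it through Ming validates the document on the way out and
# fills in the default from if_missing=datetime.utcnow.
legacy = BlogPost.m.find({'title': 'legacy post'}).first()
print(legacy.posted)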
The Object-Document Mapper
Building on the schema enforcement layer is the object-document mapper, which provides two useful patterns:
- Unit of Work - This pattern collects the changes to your objects in memory until a point at which you flush() them all to the database at once.
- Identity Map - This guarantees that if you load the same database document twice, you'll get the same object in memory. This keeps you from accidentally loading the object twice, modifying it twice, and having your two sets of changes overwrite one another.
Ming also allows you to model relationships between your documents via ForeignIdProperty and RelationProperty. Here is an example schema for a blog hosting site with multiple blogs:
from ming import schema as S
from ming.odm.declarative import MappedClass
from ming.odm.property import FieldProperty, RelationProperty
from ming.odm.property import ForeignIdProperty
from ming.odm import ODMSession

# wrap the session from the schema layer
odm_session = ODMSession(session)

class Blog(MappedClass):

    class __mongometa__:
        session = odm_session
        name = 'blog.blog'

    _id = FieldProperty(S.ObjectId)
    name = FieldProperty(str)
    posts = RelationProperty('Post')

class Post(MappedClass):

    class __mongometa__:
        session = odm_session
        name = 'blog.posts'

    _id = FieldProperty(S.ObjectId)
    title = FieldProperty(str)
    text = FieldProperty(str)
    blog_id = ForeignIdProperty(Blog)
    blog = RelationProperty(Blog)
Once you have the classes defined, you can load and modify the objects, using the odm_session to save your changes to MongoDB:
# Queries
Blog.query.find(...)             # equiv. to db.blog.blog.find(...)
blog = Blog.query.get(name='MongoDB Blog')
blog.posts                       # returns a list of post objects for the blog
blog.posts[0].blog               # returns the blog object

# Inserts
post = Post(blog=blog, ...)      # automatically sets blog_id

# Updates
post.title = 'The cool post'

# Save your changes
odm_session.flush()

# Mark post for deletion
post.delete()

# Actually delete
odm_session.flush()
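The unit-of-work and identity-map behavior described above is easy to see in a short sketch (this assumes the Blog/Post classes and odm_session defined earlier, and that a post titled 'The cool post' already exists, as in the update example):

# Identity map: loading the same document twice yields the same object in memory.
p1 = Post.query.find({'title': 'The cool post'}).first()
p2 = Post.query.find({'title': 'The cool post'}).first()
assert p1 is p2

# Unit of work: attribute changes accumulate in memory...
p1.text = 'Updated body text'
# ...and are only written to MongoDB when the session is flushed.
odm_session.flush()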
MongoDB-in-Memory
The third main component of Ming is an implementation of the pymongo API that lets you test your application without depending on a MongoDB server. To use MIM, you can swap out the creation of your pymongo connection:
import unittest

from ming import mim

class TestCase(unittest.TestCase):

    def setUp(self):
        # self.connection = Connection()
        self.connection = mim.Connection()
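The in-memory connection can then be exercised just like a real one; here is a rough sketch (the collection and field names are made up for illustration):

from ming import mim

# Everything below lives in memory; no MongoDB server is involved.
conn = mim.Connection()
db = conn['test']
db.things.insert({'name': 'widget', 'qty': 2})
print(db.things.find_one({'name': 'widget'})['qty'])   # -> 2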
MIM's support of the pymongo API and MongoDB query syntax has largely been driven by the APIs and queries used internally at SourceForge, so there are some gaps, but these are rapidly filled when reported. MIM already provides support for gridfs and mapreduce, for instance (mapreduce Javascript support is provided by python-spidermonkey). And of course MIM integrates well with the rest of Ming, allowing you to substitute a mim:// URL for the normal mongodb:// URL in your datastore:
import unittest

from ming.datastore import DataStore

class TestCase(unittest.TestCase):

    def setUp(self):
        self.ds = DataStore(
            'mim://',
            database='test')
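Because only the URL changes, the schema-layer examples from earlier should run unchanged against the in-memory store. A rough sketch (reusing the session and BlogPost names from the schema examples above):

# Bind the session from the schema-layer examples to the in-memory datastore;
# BlogPost.m.insert(), .find(), etc. work exactly as before.
session.bind = DataStore('mim://', database='test')

post = BlogPost(dict(title='test post', text='stored entirely in memory'))
post.m.insert()
assert BlogPost.m.find({'title': 'test post'}).count() == 1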
Conclusion
There are other good bits in Ming, including lazy and eager migrations, support for the MongoDB filesystem gridfs, WSGI auto-flushing middleware for the ODMSession, and more. We're also experimenting with support for GQL, Google's query language for Google App Engine (GAE), to facilitate porting apps from GAE to MongoDB. Ming is actively maintained and is a mission-critical part of the SourceForge application stack, where it has been in production use for over two years.
So what do you think? Is Ming something that you would use for your projects? Have you chosen one of the other MongoDB mappers? Please let us know in the comments below!
I really like your book, it has helped me a lot! I'm having trouble with two issues from it though.
If I have a schema such as:
Post = collection(
Field('comments', [
dict(count=float, something=float)
])
and I try:
blog.post = [
{'count': 4},
{'something': 2}
]
I will get:
ming.schema.Invalid: [0]:count: 4 is not a (, , )
How do I actually add a series of dictionary objects?
The second issue is with the mapper class and the example in your book of binding an Imperative schema with ming's ODM. If I call:
ming.odm.mapper(my_class, my_schema, odm_session)
And later try:
my_class.odm_session.flush()
I get an AttributeError: Object has no attribute odm_session. Any help would be greatly appreciated!
Thanks for the comment!
I'm not sure I understand the first question, so let me just say that if you wanted to update a post, you could do something like the following:
post = Post()
post.comments = [ { 'count': 4, 'something': 2 } ]
As for the second, I think what you want is
odm_session.flush()
Rather than
my_class.odm_session.flush()
Hope that helps!