Just a little Python: Very Simple WSGI Overview

Categories: python, programming, wsgi, web
In all the posting and hype about full-featured frameworks, you may have overlooked a very small "un-framework", the Python Web Server Gateway Interface (WSGI). It's generally offered as a deployment option for large frameworks such as TurboGears or Django. What follows is a very simple and brief overview of how you can create a WSGI-compliant application server.

First off, the WSGI specification itself is a decent read, and I'd be remiss if I didn't at least mention it. Now, on to the simple overview!

Overview

You need to realize that WSGI is exactly what its name implies: an interface. The best way I've found to think of it is "CGI for Python." In CGI, the web server launches some script as a separate process. That process's environment is populated with values from the HTTP request, and the script's output is returned to the client. WSGI is similar, substituting a Python function for the script, a Python dict for the process environment, and skipping the separate process altogether. A basic WSGI application server has the following outline:


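import sys

def MyApplication(environ, start_response):
    try:
        # ... maybe do some stuff in response to the environ arg ...
        # Send the status and headers; add more (name, value) pairs as needed.
        write_fn = start_response('200 OK', [('Content-Type', 'text/html')])
        # ... maybe do some more stuff ...
        # Then produce the body in exactly ONE of these ways:
        #   yield chunk              # makes the application a generator
        #   return [response_text]   # returns an iterable of strings
        #   write_fn(response_text)  # legacy write() callable; avoid in new code
    except Exception:
        # exc_info must be the tuple returned by calling sys.exc_info()
        start_response('500 OOPS', [('Content-Type', 'text/html')], sys.exc_info())
        # ... yield, write, or return the text of the error page ...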

Your application server, then, is just a function (or other callable) that takes two arguments, an "environment" and a "start_response". In the recommended style, it either returns an iterable (generally a list of strings; byte strings under Python 3) or is itself a generator that yields them. The minimal "hello, world!" application is below:


def MyApplication(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    # Under Python 3 the body must be bytes, hence the b"" literal.
    yield b"Hello, world!"
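
To see the calling convention from the web server's side, here is a minimal sketch that invokes a hello-world application directly, with no server involved. The names hello_app and call_app are hypothetical helpers made up for this illustration; setup_testing_defaults comes from the standard library's wsgiref.util and fills in the environ keys a real server would supply. It assumes Python 3.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    # Same shape as the "hello, world!" application; the body is bytes
    # because Python 3 WSGI servers expect byte strings.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    yield b"Hello, world!"


def call_app(app, path='/', query=''):
    """Call a WSGI app once and return (status, headers, body bytes)."""
    environ = {'PATH_INFO': path, 'QUERY_STRING': query}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would queue these for sending; we just record them.
        captured['status'] = status
        captured['headers'] = headers
        return lambda data: None  # stand-in for the legacy write() callable

    # Iterating the result is what actually runs a generator-style app.
    body = b"".join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_app(hello_app)
print(status, body)
```

To serve the same application over HTTP, the standard library's wsgiref.simple_server.make_server should be enough for local experiments.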

The "environment" is just a dict, much like the CGI environment: the CGI-style keys hold strings, while the wsgi.* keys hold other Python objects. The values available are summarized below. The "start_response" is a callable that your server must call to send the HTTP headers. You can call it up to twice, once for "normal" headers, and once for "error" headers. If you call it a second time, you must pass an "exc_info" tuple (the result of sys.exc_info()), and the call only succeeds if no output has been sent yet; otherwise the web server re-raises the exception. The original headers (if there were any) are replaced by the new headers.
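
The two-call rule for start_response can be sketched as follows. FakeServer is a hypothetical stand-in for the web server's half of the contract, written only for this illustration (its exc_info handling is modeled on the sample server in the WSGI specification), and failing_app simulates an error after the normal headers were set but before any output was sent. It assumes Python 3.

```python
import sys


def failing_app(environ, start_response):
    try:
        start_response('200 OK', [('Content-Type', 'text/plain')])
        raise RuntimeError("something broke")  # simulated failure, no output yet
    except Exception:
        # Second call: allowed only because exc_info is supplied.
        start_response('500 Internal Server Error',
                       [('Content-Type', 'text/plain')],
                       sys.exc_info())
        return [b"Sorry, something went wrong."]


class FakeServer:
    """Hypothetical stand-in for the server side of start_response."""

    def __init__(self):
        self.status = None
        self.headers = None
        self.headers_sent = False  # a real server sets this on first output

    def start_response(self, status, headers, exc_info=None):
        if exc_info is not None:
            try:
                if self.headers_sent:
                    # Too late to replace the headers: re-raise the error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a reference cycle via the traceback
        elif self.status is not None:
            raise AssertionError("headers already set")
        self.status, self.headers = status, headers
        return lambda data: None  # stand-in for the legacy write() callable


server = FakeServer()
body = b"".join(failing_app({}, server.start_response))
print(server.status)
```

Because nothing had been sent when the error occurred, the "500" headers simply replace the "200" ones.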

To do anything useful, you'll need to parse two main variables in "environ": "PATH_INFO" and "QUERY_STRING". "PATH_INFO" gives you the "rest of the path" after the mount point for your application server, and "QUERY_STRING" gives you - you guessed it - the query string. You can then implement whatever kind of URL->object mapping your heart desires, whether it be CherryPy-style object publishing, or Django-style regular expressions. You could use the functions in Python's standard cgi module to parse the query string (on current Python versions, urllib.parse.parse_qs does this job), but Ian Bicking has a great tutorial on how to use Paste to simplify matters quite a bit. All the other WSGI variables that are available in the environment are documented below.
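
As one possible take on the URL->object mapping described above, here is a small sketch that dispatches on "PATH_INFO" with a plain dict and parses "QUERY_STRING" with urllib.parse.parse_qs. The ROUTES table and the greet and about handlers are hypothetical names invented for this example; a real application could just as well use object publishing or regular expressions. It assumes Python 3.

```python
from urllib.parse import parse_qs


def greet(params):
    # parse_qs maps each key to a list of values, so take the first one.
    name = params.get('name', ['world'])[0]
    return "Hello, %s!" % name


def about(params):
    return "A tiny WSGI routing sketch."


# Hypothetical URL -> handler table; any mapping scheme would do here.
ROUTES = {'/': greet, '/about': about}


def router_app(environ, start_response):
    # PATH_INFO may be absent or empty when the mount point itself is requested.
    path = environ.get('PATH_INFO', '') or '/'
    params = parse_qs(environ.get('QUERY_STRING', ''))
    handler = ROUTES.get(path)
    if handler is None:
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b"Not found"]
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8')])
    return [handler(params).encode('utf-8')]
```

Requesting the mount point with "?name=Rick" would be routed to greet, while an unknown path falls through to the 404 branch.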

Environment

The variables available in the environ dict are summarized below. For the examples, assume the user requested (using GET) "http://server.com/some/path/myserver/more/path?query_args", and that the application server was mounted at "http://server.com/some/path/myserver".

Variable	Example	Description	Always Present?
REQUEST_METHOD	"GET"	HTTP method, generally GET or POST	Yes
SCRIPT_NAME	"/some/path/myserver"	Location in URL of application server	No - may be empty or absent if application server is mounted at server root
PATH_INFO	"/more/path"	The rest of the path after the application root	No - for instance, if user requests "http://server.com/some/path/myserver"
QUERY_STRING	"query_args"	Anything after the "?" in the URL	No
CONTENT_TYPE	<absent>	The Content-Type header of the HTTP request, if any	No
CONTENT_LENGTH	<absent>	The Content-Length header of the HTTP request, if any	No
SERVER_NAME	"server.com"	The server name part of the URL	Yes
SERVER_PORT	"80"	The server port part of the URL	Yes
SERVER_PROTOCOL	"HTTP/1.1"	The request HTTP protocol	Yes
HTTP_*	<absent>	Other HTTP headers in request	No
wsgi.version	The tuple (1,0)	WSGI version ID	Yes
wsgi.url_scheme	"http"	The URL scheme, "http" or "https"	Yes
wsgi.input	<empty file-like object>	An object from which the request body can be read - very useful for POSTs	Yes
wsgi.errors	<file-like object>	A file-like object to which the application server can write text errors to be logged by the web server	Yes
wsgi.multithread	False	Whether the application may be simultaneously invoked in a multithreaded manner	Yes
wsgi.multiprocess	True	Whether the application may be simultaneously invoked in a multiprocess manner	Yes
wsgi.run_once	False	True if the web server expects to invoke the application only once in this process (e.g. under CGI), so caching is not worthwhile	Yes
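
To show a few of these variables working together, here is a rough sketch of an application that reads a form POST from "wsgi.input", using "CONTENT_LENGTH" to decide how many bytes to read. The name echo_form_app and the hand-built environ are illustrative assumptions, not part of any library; a real web server would build the environ for you. It assumes Python 3.

```python
import io
from urllib.parse import parse_qs


def echo_form_app(environ, start_response):
    # CONTENT_LENGTH may be absent or empty, so fall back to 0.
    try:
        length = int(environ.get('CONTENT_LENGTH') or 0)
    except ValueError:
        length = 0
    raw = environ['wsgi.input'].read(length)  # request body as bytes
    form = parse_qs(raw.decode('utf-8'))
    lines = ["%s=%s" % (key, ",".join(values))
             for key, values in sorted(form.items())]
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8')])
    return ["\n".join(lines).encode('utf-8')]


# Hand-built environ imitating a small form POST (values are illustrative).
payload = b"name=Rick&lang=python"
environ = {
    'REQUEST_METHOD': 'POST',
    'CONTENT_TYPE': 'application/x-www-form-urlencoded',
    'CONTENT_LENGTH': str(len(payload)),
    'wsgi.input': io.BytesIO(payload),
}
result = echo_form_app(environ, lambda status, headers, exc_info=None: None)
print(b"".join(result).decode('utf-8'))
```

Reading exactly "CONTENT_LENGTH" bytes matters because a server is not required to signal end-of-file on "wsgi.input", so an unbounded read() could block.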

Friday, February 17, 2006
