Just a little Python: 2006

Friday, April 14, 2006

Papr.info

I have just finished getting my new project to the "usable" stage, so I thought I'd go ahead and let you know about it. It's an academic paper sharing/tagging site (slightly similar to del.icio.us, but for papers). I guess it's kind of a "mash-up" since it uses at least 3 different sites' APIs (Google, CiteSeer, and Citidel). Take a look and tell me what you think!

Thursday, March 16, 2006

A hypothesis on why Pythonistas reinvent wheels

Well, maybe the title's more inflammatory than need be. This post is kind of a follow-up to Three Reasons You Shouldn't Write Your Own Web Framework, but it's a little broader than that.

My hypothesis is that Python has two "things" that appear together in few (if any!) other languages right now, and that the combination of these ingredients makes the proliferation of web frameworks (and libraries, and RSS readers, and wikis, etc....) nearly unavoidable.

Python has power

First off, Python is a powerful, expressive language. Sure, the Lispers out there deride its lack of macros, Rubyists bemoan its "impure OO", and so on. But Python is expressive enough. Enough to lower the amount of pain it inflicts upon programmers to where some would rather build a framework/library that's just right instead of grabbing something else that's out there. It's just so stinkin' easy to code up something quickly that will do exactly what you want. (Well, almost what you want, and you know how to avoid those bugs, anyway. ;-) )

But there are other languages that are more expressive, concise, "powerful." Why have they, too, not embarked upon the path of infinite diversity?

Python has programmers

Python is one of the most widely-used dynamic languages in the world today. Yes, Perl is more widely used. [Edit 4/1/07: In my completely uninformed opinion as someone who has never used Perl for anything even remotely serious,] Perl also makes it very easy to get yourself in trouble very quickly when building a large system. Not that it can't be done. Just that it kind of fails the "powerful" criterion above. Ruby is "getting there," or it may already have surpassed Python's popularity (hard to know for sure with such scientific measurements as "# of Google search results", "# of new books", "# of SourceForge/FreshMeat projects", etc.). But Ruby has not had the time to develop the same critical mass of programmers with a hankering to build their own framework. Haskell, ML, Smalltalk, and so forth just don't have the base, either. Lispers built quite a few frameworks/libraries back in the day, but these were mainly in the realm of AI, so we won't hear much about them until AI comes back in vogue.

What to do?

I can't say I really know. After seeing Ian's tutorial on building your own WSGI framework, I see the allure. Working in academia, I also feel the urge to build upon prior research. Maybe it's the nature of the beast (having a widely-used, expressive language). What do you think?

Monday, March 06, 2006

TG Admin Interface (part II)

Well, the week was less-than-productive on the automated TG admin pages front. I did manage to extract the CRUD code from the model class, however. Here is the (new) code to create an admin page for a given SQLObject model:


import model
from crud import crudbase, fields

class DistributionCenterView(crudbase):
    modelClass=model.DistributionCenter
    displayName='Distribution Center'
    fields=fields('name', 'note',
                  'address.street1', 'address.street2',
                  'address.city', 'address.state', 'address.zip',
                  'stores', 'trucks')
    key='name'

This is for the following SQLObject model:


class DistributionCenter(SQLObject):
    def destroySelf(self):
        self.address.destroySelf()
        SQLObject.destroySelf(self)

    name=StringCol(default=uniqueValue('DistributionCenter', 'name', 'ctr'),
                   alternateID=True)
    note=StringCol(default='')
    address=ForeignKey('Address', default=newAddressID, cascade=False)
    stores=MultipleJoin('Store')
    trucks=MultipleJoin('Truck')

The resulting list view looks a little like this (simplified a bit for blogger):

	Name	Note	Street1	Street2	City	State	Zip	Stores	Trucks
Edit Delete	Dist Ctr A	A note on dca	123 Main Street		Marietta	GA	30062	Home Dep 3	SXG 456,ABC 123
Edit Delete	Center 2	Another Center	456 Beechwood Ave.		Marietta	GA	30067	Store Num 1,store_2	333 HFF

As you can see, foreign keys and multiple joins are handled, as well as columns from foreign tables (via the "address.street1..." stuff). I'm still working on cleaning things up, but if this code would be a useful starting point for you, drop me a comment and I'll email you a copy. Alternatively, if there are several people interested, we'll set up a project on SourceForge.
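One way dotted names such as "address.street1" could be resolved is by walking the attributes one step at a time. The sketch below is only an illustration, not the code from the crud module: the resolve_field helper and the stand-in Address and DistributionCenter classes are hypothetical, and plain Python objects take the place of SQLObject rows.

```python
from functools import reduce


def resolve_field(obj, dotted_name):
    """Follow a dotted path such as 'address.street1' one attribute at a time."""
    return reduce(getattr, dotted_name.split('.'), obj)


# Hypothetical stand-ins for SQLObject rows, just to exercise the helper.
class Address(object):
    def __init__(self, street1, city):
        self.street1 = street1
        self.city = city


class DistributionCenter(object):
    def __init__(self, name, address):
        self.name = name
        self.address = address


center = DistributionCenter('Dist Ctr A', Address('123 Main Street', 'Marietta'))
row = [resolve_field(center, f) for f in ('name', 'address.street1', 'address.city')]
print(row)
```

Because each step is an ordinary attribute lookup, the same helper would follow a foreign-key attribute on a real model object without knowing anything about the database.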

Monday, February 27, 2006

Django-like automated admin interface

Categories: python, programming, turbogears

As I continue to upgrade ConsulTracker and work on my new (not-yet announced) project, I often find myself wanting a nice CRUD-like interface, something akin to what I’ve heard Django has. While less masochistic developers might go for the “real deal” of Django, I chose to begin adding admin-style CRUD pages for TurboGears, the Python web framework with which I’m most familiar.

Requirements

Now, TurboGears already has something called “FastData”, but I’ve found it somewhat limiting. What I started with, then, is a set of requirements:

Table View

There should be an (HTML) table view of the (SQL) table which shows all rows of the (SQL) table.

The columns shown for each table must be customizable with a minimum of custom code.

When a column contains a foreign key, something “useful” must show up in that column (e.g. not the numeric “id” column of the foreign table)

The table should have the capability of displaying columns from foreign-key’d tables as native columns. (This is kind of like creating an SQL view containing columns from multiple tables.)

The table should have a link for editing each row, a link for marking a row for deletion, and a button for actually deleting the marked items. This is really a pet peeve of mine. GET-style links should not modify data in a well-designed web application. That can lead to all sorts of nastiness, especially when dealing with web spiders that follow all links. So I’ll keep all the data-modifying stuff in buttons.

From these requirements, I determined that I needed some metadata, call it “CRUD metadata” or crudmeta for short, containing the following for each SQLObject model I want to view:

The fields to include in the table (note that these can be sub-fields, as in “address.street1”, etc.)

The “name” of each row in the table—this is for displaying the row as a “foreign key” in another table.

Edit View

There should be an “edit” view which allows for the following functionality:

Each SQLObject property referenced in “crudmeta” should have a place on the form.

The fields in the form should make sense for their data type. This means:
- TextFields for StringCol, IntCol, FloatCol, etc. (with data validation)
- CalendarDate[Time]Picker for Date[Time]Col
- Select fields for ForeignKey – populated with the “name” from crudmeta, of course!
- Tables for MultipleJoin fields with functionality similar to the “Table View” above.

AJAX-y goodness to allow you to “drill down” in the object hierarchy without losing your context.

Note that some (most) of these fields have TurboGears 0.9 widgets-style names. Not a coincidence. I want to reuse as much as possible from existing TurboGears development so as to avoid “re-inventing the wheel.”
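The rule that each column type gets a sensible form field amounts to a lookup table. Here is a minimal sketch of that idea. The widget names are plain strings loosely modeled on the names mentioned above, and both WIDGET_FOR_COLUMN and widget_for are hypothetical, not part of TurboGears or SQLObject.

```python
# Column type name -> widget name. Both sides are plain strings here, so the
# sketch runs without TurboGears or SQLObject installed.
WIDGET_FOR_COLUMN = {
    'StringCol': 'TextField',
    'IntCol': 'TextField',
    'FloatCol': 'TextField',
    'DateCol': 'CalendarDatePicker',
    'DateTimeCol': 'CalendarDateTimePicker',
    'ForeignKey': 'SelectField',
    'MultipleJoin': 'Table',
}


def widget_for(column_type, default='TextField'):
    """Pick a widget name for a column type, falling back to a text field."""
    return WIDGET_FOR_COLUMN.get(column_type, default)


print(widget_for('DateTimeCol'))
print(widget_for('ForeignKey'))
```

Keeping the mapping in one dict means a new column type needs one new entry rather than another branch in the form-building code.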

Status & next steps

As of today (2006-02-27), the basic functionality is there. It’s not pretty, it doesn’t handle errors well, it needs refactoring badly, and it mixes model and view a bit from MVC, but it exists. My current plan is to clean it up a bit this week and (hopefully) make it available by next weekend. To whet your appetite, here is an (abbreviated) model from my new application with the CRUD annotations present. (Currently, this is all the custom code you need in order to create the above-described interface.)



class Store(SQLObject):
    class crudmeta:
        fields=crud.fields('name', 'note',
                           'address.street1', 'address.street2',
                           'address.city', 'address.state', 'address.zip',
                           ('distributionCenter', 'Distribution Center'),
                           'orders')
        key='name'

    name=StringCol(default=uniqueValue('Store', 'name', 'store'),
                   alternateID=True)
    note=StringCol(default='')
    distributionCenter=ForeignKey('DistributionCenter',default=None,
                                  cascade=None)
    address=ForeignKey('Address',default=newAddressID)
    orders=MultipleJoin('StoreOrder')
    deliveries=MultipleJoin('Delivery')

Friday, February 17, 2006

Very Simple WSGI Overview

Categories: python, programming, wsgi, web
In all the posting and hype about full-featured frameworks, you may have overlooked a very small "un-framework", the Python web server gateway interface (WSGI). It's generally an option for deploying the large frameworks such as TurboGears or Django. What follows is a very simple and brief overview of how you can create a WSGI-compliant application server.

First off, the WSGI specification itself is a decent read, and I'd be remiss if I didn't at least mention it. Now, on to the simple overview!

Overview

First off, you need to realize that WSGI is exactly what its name implies: an interface. The best way I've found to think of it is "CGI for Python." In CGI, the web server launches some script as a separate process. That process's environment is populated with values from the HTTP request, and the script's output is returned to the client. WSGI is similar, substituting a Python function for the script, a Python dict for the process environment, and skipping the separate process altogether. A basic WSGI application server has the following outline:


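import sys

def MyApplication(environ, start_response):
    try:
        # ... maybe do some stuff in response to the environ arg ...
        # Send the status line and headers; the return value is a write() callable.
        write_fn = start_response('200 OK', [('Content-type', 'text/html')])
        # ... maybe do some more stuff, then produce the body in ONE of these ways:
        #   EITHER  yield some strings (which makes this function a generator)
        #   OR      return an iterable of strings (shown here)
        #   OR      call write_fn(response_text)   # legacy, discouraged
        return ['<p>some response text</p>']
    except Exception:
        # Passing exc_info lets the server replace headers it has not sent yet.
        start_response('500 OOPS', [('Content-type', 'text/html')], sys.exc_info())
        # ... yield, write, or return the text of the error page ...
        return ['<p>some error text</p>']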

Your application server, then, is just a function (or other callable) that takes two arguments, an "environment" and a "start_response". In the recommended implementation, your server will either return an iterable (generally a list of strings) or be a generator function, so that calling it produces the iterable. The minimal "hello, world!" application is below:


def MyApplication(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    yield "Hello, world!"
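To see the interface in action without a web server, you can call the application yourself. The sketch below assumes Python 3, where the body must be bytes rather than str. fake_start_response is a hypothetical stand-in for the callable a real server would pass in, and setup_testing_defaults comes from the standard library's wsgiref.util.

```python
from wsgiref.util import setup_testing_defaults


def MyApplication(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    yield b"Hello, world!"  # bytes, as Python 3 WSGI servers expect


captured = {}


def fake_start_response(status, headers, exc_info=None):
    """Record what the application sends instead of writing to a socket."""
    captured['status'] = status
    captured['headers'] = headers


environ = {}
setup_testing_defaults(environ)  # fills in REQUEST_METHOD, PATH_INFO, wsgi.* ...

# Iterating the generator is what actually runs the application body.
body = b"".join(MyApplication(environ, fake_start_response))
print(captured['status'], body)
```

Note that because the application is a generator, start_response is not called until the server starts iterating over the result.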

The "environment" is just a dict, mostly of strings, much like the CGI environment. The values available are summarized below. The "start_response" is a callable that your server must call to send the HTTP Headers. You can call it up to twice, once for "normal" headers, and once for "error" headers. If you call it a second time, you must call it before generating any output, and you must call it with an "exc_info" object (the tuple returned by sys.exc_info()). The original headers (if there were any) will be overwritten by the new headers.
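The two-call pattern might look like the sketch below. flaky_app and recording_start_response are hypothetical names made up for the illustration, and the body is bytes on the assumption that this runs under Python 3.

```python
import sys


def flaky_app(environ, start_response):
    """Hypothetical app that fails after choosing its normal headers."""
    try:
        start_response('200 OK', [('Content-type', 'text/plain')])
        raise ValueError('something broke before any output was produced')
    except Exception:
        # Second call: allowed only because exc_info is supplied.
        start_response('500 Internal Server Error',
                       [('Content-type', 'text/plain')],
                       sys.exc_info())
        return [b'Sorry, something went wrong.']


calls = []


def recording_start_response(status, headers, exc_info=None):
    """Stand-in for the server's callable; remembers every call it receives."""
    calls.append((status, exc_info is not None))


body = flaky_app({}, recording_start_response)
print(calls)
print(body)
```

A real server would use the second call's status and headers, since nothing had been sent to the client when the error occurred.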

To do anything useful, you'll need to parse two main variables in "environ": "PATH_INFO" and "QUERY_STRING". "PATH_INFO" gives you the "rest of the path" after the mount point for your application server, and "QUERY_STRING" gives you - you guessed it - the query string. You can then implement whatever kind of URL->object mapping your heart desires, whether it be CherryPy-style object publishing, or Django-style regular expressions. You could use the functions in Python's standard cgi module to parse the query string, but Ian Bicking has a great tutorial on how to use Paste to simplify matters quite a bit. All the other WSGI variables that are available in the environment are documented below.
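As a rough illustration of the simplest possible URL mapping, here is a sketch that branches on PATH_INFO and reads one parameter from QUERY_STRING. It assumes Python 3, where the query-string parser lives in urllib.parse rather than the cgi module. dispatch_app and the call helper are hypothetical names invented for the example.

```python
from urllib.parse import parse_qs


def dispatch_app(environ, start_response):
    """Route on PATH_INFO and read one parameter from QUERY_STRING."""
    path = environ.get('PATH_INFO', '')
    query = parse_qs(environ.get('QUERY_STRING', ''))
    if path == '/hello':
        name = query.get('name', ['world'])[0]
        status, text = '200 OK', 'Hello, %s!' % name
    else:
        status, text = '404 Not Found', 'No such page: %s' % path
    start_response(status, [('Content-type', 'text/plain; charset=utf-8')])
    return [text.encode('utf-8')]


def call(app, path, query_string=''):
    """Hypothetical helper: invoke a WSGI app with a minimal environ."""
    seen = []
    environ = {'REQUEST_METHOD': 'GET', 'PATH_INFO': path,
               'QUERY_STRING': query_string}

    def start_response(status, headers, exc_info=None):
        seen.append(status)

    body = b''.join(app(environ, start_response))
    return seen[0], body


print(call(dispatch_app, '/hello', 'name=Atlanta'))
print(call(dispatch_app, '/missing'))
```

parse_qs returns a list of values per key, which is why the example takes element 0.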

Environment

The variables available in the environ dict are summarized below. For the examples, assume the user requested (using GET) "http://server.com/some/path/myserver/more/path?query_args", and that the application server was mounted at "http://server.com/some/path/myserver".

Variable	Example	Description	Always Present?
REQUEST_METHOD	"GET"	HTTP method, generally GET or POST	Yes
SCRIPT_NAME	"/some/path/myserver"	Location in URL of application server	No - if application server is mounted at server root
PATH_INFO	"/more/path"	The rest of the path after the application root	No - for instance, if user requests "http://server.com/some/path/myserver"
QUERY_STRING	"query_args"	Anything after the "?" in the URL	No
CONTENT_TYPE	<absent>	Any Content-Type fields in the HTTP request	No
CONTENT_LENGTH	<absent>	Any Content-Length fields in the HTTP request	No
SERVER_NAME	"server.com"	The server name part of the URL	Yes
SERVER_PORT	"80"	The server port part of the URL	Yes
SERVER_PROTOCOL	"HTTP/1.1"	The request HTTP protocol	Yes
HTTP_*	<absent>	Other HTTP headers in request	No
wsgi.version	The tuple (1,0)	WSGI version ID	Yes
wsgi.url_scheme	"http"	The initial part of the URL	Yes
wsgi.input	<empty file-like object>	An object from which the request body can be read - very useful for POSTs	Yes
wsgi.errors	<file-like object>	A file-like object to which the application server can write text errors to be logged by the web server	Yes
wsgi.multithread	False	Whether the application may be simultaneously invoked in a multithreaded manner	Yes
wsgi.multiprocess	True	Whether the application may be simultaneously invoked in a multiprocess manner	Yes
wsgi.run_once	False	Tune the application to expect to only run once (e.g. turn off caching)	Yes
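To check how these variables fit together, you can rebuild the requested URL from them. This sketch uses request_uri from the standard library's wsgiref.util. The environ dict is filled in by hand with the example values from the table, which is an assumption about what a server would supply for that request.

```python
from wsgiref.util import request_uri

# The environ values from the table, for the example GET request.
environ = {
    'REQUEST_METHOD': 'GET',
    'SCRIPT_NAME': '/some/path/myserver',
    'PATH_INFO': '/more/path',
    'QUERY_STRING': 'query_args',
    'SERVER_NAME': 'server.com',
    'SERVER_PORT': '80',
    'SERVER_PROTOCOL': 'HTTP/1.1',
    'wsgi.url_scheme': 'http',
}

# scheme + host + SCRIPT_NAME + PATH_INFO + '?' + QUERY_STRING
url = request_uri(environ)
print(url)
```

The port is left out of the result because 80 is the default for the "http" scheme.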

Thursday, February 16, 2006

Effective Decorators

Michele Simionato has written a very nice-looking module that allows decorators to be used quite effectively:

The aim of the decorator module is to simplify the usage of decorators for the average programmer, and to popularize decorators usage giving examples of useful decorators, such as memoize, tracing, redirecting_stdout, locked, etc.

Even if you don't end up using his module, I think the documentation is a great read, and a good example of how to use your own decorators more effectively.
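For a feel of what such decorators do, here is a minimal memoize written with only the standard library. It is a sketch, not the decorator module's API: it uses functools.wraps to keep the wrapped function's name and docstring, whereas the decorator module goes further and preserves the full signature.

```python
import functools


def memoize(func):
    """Cache results by positional arguments (which must be hashable)."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    wrapper.cache = cache  # exposed so callers can inspect or clear it
    return wrapper


@memoize
def fib(n):
    """Return the n-th Fibonacci number."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30), fib.__name__, len(fib.cache))
```

Without the cache, the recursive fib would make over a million calls for n=30; with it, each value from 0 to 30 is computed once.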

Tuesday, February 14, 2006

Topics

Now, I'm a guy who can go on and on about a technical topic ad nauseam. (Ask my wife!) But sometimes it takes a while for me to get started. So if there are any topics you'd like me to cover here, particularly regarding Python, please post them in the comments. I'd like to make this blog a useful resource, but my creativity-challenged brain may need a bit of help. Thanks!

Sunday, February 12, 2006

ConsulTracker.com is now live!

ConsulTracker.com is now live and accepting "real" customers! To sign up, just go to our signup page. Sincere thanks go out to our beta customers for helping us "get the bugs out." We are offering a free one-month trial period (4 user maximum), or you can sign up for just $6 per user per month.

If you just want to see the features before committing, we also have a demo site set up. You can log into the demo site with userid "test", password "test", selecting "Test Domain" on the login screen.
And of course, if you have suggestions or feature requests, please feel free to comment on this blog, post a note in our forums, or enter a bug in our bug-tracking system.

Wednesday, February 08, 2006

SQL in SQL

Categories: python, sql, sqlobject, programming

One thing I think would be useful is the ability to create a SQL database "on top" of another SQL database. This would allow you to write an application where the user creates their own DB schema, somewhat like MS Access, but the underlying database schema doesn't change. I'm exploring this a bit right now for my next project.

The basic idea is to create four tables: vtable, vcol, vrow, and vcell. vtable is a "virtual table", vcol is a "virtual column," and so on. I'm currently using SQLObject for the DB layer, so all the examples will use SQLObject, even though I'm not using it to the extent it could be used.

The setup code for a table is below:


from sqlobject import SQLObject, StringCol, ForeignKey, MultipleJoin

class VTable(SQLObject):
    tabname=StringCol(alternateID=True)
    vcols=MultipleJoin('VCol')
    vrows=MultipleJoin('VRow')

    @classmethod
    def create(klass, name, **columns):
        t=klass(tabname=name)
        for colname, coltype in columns.items():
            t.addColumn(colname, coltype)
        return t

    def addColumn(self, colname, coltype):
        VCol(colname=colname, coltype=coltype, vtable=self)

    def getColumn(self, colname):
        for c in self.vcols:
            if c.colname == colname: return c
        return None

    def insert(self, **values):
        row = VRow(vtable=self)
        for colname, value in values.items():
            col=self.getColumn(colname)
            cell=VCell(vcol=col, vrow=row, stringValue=value)
        return row

Fairly straightforward: a vtable is a collection of vcols & vrows. I also include helper functions for performing table creation and inserts. Next is the column definition:


class VCol(SQLObject):
    colname=StringCol(alternateID=True)
    coltype=StringCol(default='str')
    vtable=ForeignKey('VTable')
    vcells=MultipleJoin('VCell')

Even better: a column is just a "type" (currently ignored), a name, and a collection of vcells. The Row object is also simple:


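class VRow(SQLObject):
    vtable=ForeignKey('VTable')
    vcells=MultipleJoin('VCell')

    def getCell(self, colname):
        # Return the VCell for the named column, or None if there is none yet
        for cell in self.vcells:
            if cell.vcol.colname == colname: return cell
        return None

    def __getitem__(self, colname):
        cell = self.getCell(colname)
        if cell is None: return None
        return cell.stringValue

    def __setitem__(self, colname, value):
        # Look up the cell itself (not its value) so it can be updated in place
        cell = self.getCell(colname)
        if cell is None:
            vcol=self.vtable.getColumn(colname)
            VCell(vcol=vcol, vrow=self, stringValue=value)
        else:
            cell.stringValue = value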

A row's main purpose is to have an id value which ties all the cells of a row together. Without further ado, here is the vcell class:

       
class VCell(SQLObject):
    vcol=ForeignKey('VCol')
    vrow=ForeignKey('VRow')
    stringValue=StringCol(default=None)

A vcell is simply a vrow & vcol reference, and an attached data value. Now, what can we do with this? Well, we can create vtables and insert values into them fairly easily. Suppose we have created and populated a table as follows:


t=VTable.create('mytable', col1='str', col2='str', col3='date')
for i in range(3):
    t.insert(col1='foo%d' % i, col2='bar%d' % i, col3='baz')

This is equivalent to


CREATE TABLE mytable (col1 TEXT, col2 TEXT, col3 TEXT); 
INSERT INTO mytable(col1,col2,col3) VALUES ('foo0', 'bar0', 'baz');
INSERT INTO mytable(col1,col2,col3) VALUES ('foo1', 'bar1', 'baz');
INSERT INTO mytable(col1,col2,col3) VALUES ('foo2', 'bar2', 'baz');

So what if we want to do something like SELECT col1, col2 FROM mytable with the vtable? What does the SQL look like for that? Can you even do it in a single SQL statement? It turns out you can:


SELECT vcell0.string_value AS col1,vcell1.string_value AS col2 
FROM vtable vt0, vcol vcol0, vcell vcell0, vcol vcol1, vcell vcell1
WHERE vt0.tabname='mytable' 
  AND vcol0.vtable_id = vt0.id AND vcol0.colname='col1' AND vcell0.vcol_id=vcol0.id 
  AND vcol1.vtable_id = vt0.id AND vcol1.colname='col2' AND vcell1.vcol_id=vcol1.id 
  AND vcell0.vrow_id=vcell1.vrow_id;

If we run this, we get the expected results:


>>> conn.queryAll(sql)
[('foo0', 'bar0'), ('foo1', 'bar1'), ('foo2', 'bar2')]
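If you want to try the query without SQLObject, the four tables can be set up by hand. The following is a sketch using the standard library's sqlite3 module. The table and column names (vtable_id, string_value, and so on) are an assumption about SQLObject's default naming, chosen to match the query above, and the results are sorted in Python because the query has no ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript('''
    CREATE TABLE vtable (id INTEGER PRIMARY KEY, tabname TEXT UNIQUE);
    CREATE TABLE vcol   (id INTEGER PRIMARY KEY, colname TEXT, coltype TEXT,
                         vtable_id INTEGER REFERENCES vtable(id));
    CREATE TABLE vrow   (id INTEGER PRIMARY KEY,
                         vtable_id INTEGER REFERENCES vtable(id));
    CREATE TABLE vcell  (id INTEGER PRIMARY KEY,
                         vcol_id INTEGER REFERENCES vcol(id),
                         vrow_id INTEGER REFERENCES vrow(id),
                         string_value TEXT);
''')

# Virtual CREATE TABLE mytable (col1, col2, col3)
tab_id = conn.execute("INSERT INTO vtable (tabname) VALUES ('mytable')").lastrowid
col_ids = {}
for colname in ('col1', 'col2', 'col3'):
    cur = conn.execute(
        'INSERT INTO vcol (colname, coltype, vtable_id) VALUES (?, ?, ?)',
        (colname, 'str', tab_id))
    col_ids[colname] = cur.lastrowid

# Virtual INSERTs: one vrow per row, one vcell per value
for i in range(3):
    row_id = conn.execute('INSERT INTO vrow (vtable_id) VALUES (?)',
                          (tab_id,)).lastrowid
    values = (('col1', 'foo%d' % i), ('col2', 'bar%d' % i), ('col3', 'baz'))
    for colname, value in values:
        conn.execute(
            'INSERT INTO vcell (vcol_id, vrow_id, string_value) VALUES (?, ?, ?)',
            (col_ids[colname], row_id, value))

sql = '''
SELECT vcell0.string_value AS col1, vcell1.string_value AS col2
FROM vtable vt0, vcol vcol0, vcell vcell0, vcol vcol1, vcell vcell1
WHERE vt0.tabname='mytable'
  AND vcol0.vtable_id = vt0.id AND vcol0.colname='col1' AND vcell0.vcol_id=vcol0.id
  AND vcol1.vtable_id = vt0.id AND vcol1.colname='col2' AND vcell1.vcol_id=vcol1.id
  AND vcell0.vrow_id=vcell1.vrow_id
'''
rows = sorted(conn.execute(sql).fetchall())
print(rows)
```

Each selected column costs two extra joins (one vcol, one vcell), which is why the hand-written form gets unwieldy so quickly.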

But this is ugly and hard to write and read. What if there were a function to build a virtualized query automatically? Well, I've created such a function. Right now, it is very kludge-y and has limited functionality (it doesn't support a virtual "WHERE" clause yet, for example.) But it's a starting point. What I'd really like to have is a sql-ish interface that allows complete virtualization of table creation, query, update, etc., all without modifying the underlying database schema. If anyone is interested in possibly contributing to such a project, let me know in the comments below, and I'll set up a project on SourceForge.

Oh, and here's the "virtual query builder".


def vquery(columns, tables):
    '''vquery(columns, tables) - create a (real) query from a virtual one
    
    columns    : list of (table,column) tuples
    tables     : list of table names
    '''
    tabtrans = {} # tabtrans[tabname] = tab_label
    coltrans = {} # coltrans[colname] = (tab_label, col_label, cell_label)
    num_tables = 0
    num_columns = 0
    sql='SELECT '
    real_tables, real_columns, real_cells = [], [], []
    for t in tables:
        tab_label = 'vt%d' % num_tables
        num_tables += 1
        tabtrans[t] = tab_label
        real_tables.append(tab_label)
    selectclause=[]
    for t,c in columns:
        col_label = 'vcol%d' % num_columns
        cell_label = 'vcell%d' % num_columns
        num_columns += 1
        coltrans[c] = (tabtrans[t], col_label, cell_label)
        real_columns.append(col_label)
        real_cells.append(cell_label)
        selectclause.append('%s.string_value as %s' % 
                            (cell_label, c))
    # Build virtualized whereclause & tablist
    whereclause=[]
    tablist = []
    for user_name, real_name in tabtrans.items():
        tablist.append('vtable %s' % real_name)
        whereclause.append("%s.tabname='%s'" % (real_name, user_name))
    last_cell_name = None
    for user_col_name, (real_table_name, real_col_name, real_cell_name) in coltrans.items():
        tablist.append('vcol %s' % real_col_name)
        tablist.append('vcell %s' % real_cell_name)
        whereclause.append("%s.vtable_id = %s.id" %
                           (real_col_name, real_table_name))
        whereclause.append("%s.colname='%s'" %
                           (real_col_name, user_col_name))
        whereclause.append("%s.vcol_id=%s.id" %
                           (real_cell_name, real_col_name))
        if last_cell_name is not None:
            whereclause.append("%s.vrow_id=%s.vrow_id" % 
                               (real_cell_name, last_cell_name))
        last_cell_name = real_cell_name
    sql = 'SELECT %s FROM %s WHERE %s' % (
        ','.join(selectclause),
        ','.join(tablist),
        ' AND '.join(whereclause))
    return sql

Thursday, February 02, 2006

Python Atlanta Meeting Website

Categories: python, meetup, turbogears
There was some discussion at the last Atlanta Python Meetup about building our own "meetup" style website using one of the Python web development frameworks. I have thrown together a "meetup"-style website for the Atlanta Python Meeting using TurboGears 0.8.8. You can see it here. You can download the source here. And of course, you are invited to the meeting on February 9.

If anyone else is interested in maintaining / enhancing the code, let me know and I'll give you CVS & login access. If anyone is interested in using this website, well then, register. And if you just want me to add features, then email me or post the requests in the comments.

Wednesday, February 01, 2006

Python Web Frameworks

Categories: python, web, programming, framework
Well, the BDFL is looking for a web framework. Laying aside all the arguments for TurboGears, Django, web.py, Nevow, Cheetah, Paste, etc., I noticed that Glyph Lefkowitz and Ian Bicking both piped up with their input. Glyph is involved in Nevow, and Ian is involved in Paste.

Glyph makes the point that web frameworks come with a bit of cognitive overhead, but that this overhead is necessary. You really need to see the world from the perspective of the framework. Even if you write your own, you have to invent your own webgeist, and it's still work. I couldn't agree more, as I discuss in my previous entry.

Ian, on the other hand, is (it seems) the father of Paste, which seems to be a toolbox of sorts enabling the easy creation of web frameworks. Ah, so he's the one responsible for making it so easy to roll your own! Anyway, Paste does seem to take care of many of the details of writing a web framework. Ian has written a nice tutorial on framework construction. Not that I want you to go there - no - don't - write - your - own - oh well, I've lost you. Seriously, it looks pretty easy. It would be nice to see all the existing frameworks that aren't already based on Paste ported to it, if only to remove the redundant code. Ian has built a good tool. It would be nice to see people using it more.

Anyway, that's my $.02. Looks like Glyph and I have a common philosophical POV - drink the kool-aid of your framework of choice and get on with development. Ian, while not advocating the proliferation of frameworks, certainly makes the proliferation simpler. And, of course, if you want to find the framework that's "right" for you, the comments on Guido's blog are a great place to start.

Friday, January 13, 2006

Three Reasons Why You Shouldn't Write Your Own Web Framework

Categories: python, web, programming, framework, turbogears, django, zope

Jeremy Jones has some thoughts about Django and Rails. He also asks the question: "...what if re-creating existing code sometimes isn't as evil as we've been taught?" Good question to ask; I'd respond in the comments if I had an O'Reilly account, but I don't, so here goes:

Motivating Story

Let's say you're a developer and you're looking to build a whiz-bang dynamic content website. Your weapon of choice is Python, so you start looking at the various Python web frameworks. After losing many hours and half your hair examining the 31 flavors (and growing!), you decide to look into one or two in detail. Say you look at Django and TurboGears. Django has about 60% of what you need, and TurboGears has about 75% of what you need. At which point you say "Python web frameworks all suck!", you write your own, and you port your site to it. (Not to name names, or anything... ;-) ) So you end up a) examining the frameworks, b) writing your own new framework, c) actually constructing your site in your new framework, and d) promoting your new framework to others.

Why is this a bad thing?

I've got three reasons that come to the top of my mind. But first, let me give you an alternative story. Rather than writing your own framework, you decided to add the 25% that TurboGears lacked. You either became a TurboGears developer with source tree access and updated the source directly or sent your patches to someone who would integrate them later. You then ported your site to the (possibly customized) TurboGears framework. So why was the first scenario (where you "rolled your own") bad?

1. You wasted your time

I'm not saying that you could have avoided all the work of developing your own framework by extending someone else's. I'm just saying you could have avoided some of it. There's some overhead in reading the other framework's code and grokking the design, as well as in actually modifying the framework by implementing the functionality you need. But I contend that it's a lot less work than you would spend inventing your own framework.

Then there's the problem of bugs. In the 75% of TurboGears that you re-implemented, you wrote bugs. You probably revised your initial design a few times. This is all work you could have avoided if you'd chosen to contribute to an existing project rather than rolling your own.

Then there's the work you did promoting your framework to others. If you'd enhanced TurboGears (or Django, this is just an example), you could ride the free wave of publicity that comes with a well-known project.

2. You wasted other people's time

This really comes from promoting your framework. If you want to keep everything to yourself, you won't waste anyone's time but your own. But when you make it available to others, you generate more work for the next developer trying to decide on a framework. (And because of programmer style, he's just about as likely to decide that your framework sucks as any other.) But he had to do the research. If you were on his "short list", he had to spend a decent amount of time weighing the benefits of using your framework. And if he decided not to go with you, that's time that is (mostly) wasted.

3. Your work has diminished impact

You did some clever things implementing your framework. You might have used a new abstraction that no one has considered before. Maybe you created a templating framework with features that 70% of developers want, and no other framework provides. You want to contribute back to the community. Well, by creating a new framework, the size of your community is a lot smaller than it could be. Contribute code to an existing framework, make it even better, and all its users benefit from your work. "Roll your own," and you'll end up pulling a few developers from using other frameworks to using yours, but you'll probably still have a smaller impact than if you'd contributed to an existing framework.

The dynamics of markets usually end up promoting only 1-3 really dominant players: "gorillas" and "chimps", to borrow from Geoffrey Moore. All the "monkeys" then end up competing for the scraps left by the others. This is true in "real" marketplaces, and it's true in the marketplace of ideas. Most people have some resistance to change, and a great many don't want to spend any more time than necessary making decisions. Where is all this going? Realistically, you're not too likely to become the gorilla if you roll your own framework. And even if you manage to, you've diminished the impact of the previous gorilla's contributions to the state of the art. In the FOSS world, gorillas can actually be a net positive for everyone, and (I believe) should be encouraged. If OpenBSD, FreeBSD, and NetBSD were stronger competitors to Linux in Free kernels, FOSS kernels would run on a lot fewer machines, there would be less hardware support, and things would in general be a lot worse for everyone involved.

Why do people continue to re-invent the wheel?

So why do we have such a proliferation of web frameworks? Here are some guesses:

It's really easy to build a new framework - Titus Brown discusses this a bit. The "pain threshold" in rolling your own is so low that you might as well do it.

Writing code is more fun than reading code - This is one that bites me all the time. You convince yourself that you could write what you need faster than you could grok existing code and extend it. Maybe so. But if you're like me, you'd end up grossly underestimating the amount of time needed to test and debug your new design, not to mention the design changes you make as you develop it. (Maybe call that "grokking your own design" ;-) )

It's cool to have your own framework - OK, there are personal benefits to rolling your own. You can blog about it. Other people can blog about it. People get to know your name. Self-promotion may sound crass, but it's nice to be recognized. And it helps getting a job. For the benefit of the state of the art, though, I'd still rather you contributed to an existing codebase.

You want something with a fundamentally different architecture than anything that's out there - I think this is perhaps the first really valid reason to roll your own. Zope should not be the only framework to choose from. It works for a lot of people. But it has a baroque architecture compared to pretty much any other Python web framework. I'll go out on a limb and say that most people who think everything else is fundamentally wrong for their needs are just trying to justify a decision that was actually made because of one of the above three reasons. It's better to find a mainstream framework you can live with than invent a "perfect" framework that no one else can.

The existing project manager won't accept your help - I guess sometimes this happens. It seems to have happened with XFree86. OK, go ahead and roll your own. If the manager's truly impossible to deal with, you might even fork the project. But I think this, like the reason above, is a lot less likely than many think.

Am I a hypocrite?

In this post, yeah, a little bit. As more of a user of frameworks than an author, I haven't contributed code to anything (yet). I use a "chimp" language (Python) rather than a "gorilla" (Java, or Perl or PHP for dynamic languages). I've rolled a lot of my own code because I was too lazy to grok someone else's. Hopefully I'll get to the ideal in this post eventually. But I am trying to head in (what I think is) the right direction.

Very brief Python meetup report

Well, I was late to the meetup (as those who know me well might expect...). It was an interesting time. Might have just been me, but it seemed like people are still in the getting-to-know-you phase, which is fine with me, since this was my first time. What I did find interesting is that the topics of conversation were just about exclusively focused on web development (at least while I was there). I wonder whether that's a Python phenomenon or if all meetups for all "web-enabled" languages have the same characteristics. Not that I'm not interested in web development; I am. It's just not what I use Python for at my day job (building compilers). Of course, my suspicion is that there just aren't that many people building compilers in Python. At least not yet....

Thursday, January 12, 2006

A Customer!

I am happy to say that I signed up my first beta customer for ConsulTracker.com yesterday! It was an exhilarating and terrifying experience, as of course a customer finds bugs you'd never think to test for. This has given me a new drive to build an extensive test suite. So much better to fix the bugs before your customers find them!
