Friday, January 13, 2006

Three Reasons Why You Shouldn't Write Your Own Web Framework


Jeremy Jones has some thoughts about Django and Rails. He also asks the question: "...what if re-creating existing code sometimes isn't as evil as we've been taught?" Good question to ask; I'd respond in the comments if I had an O'Reilly account, but I don't, so here goes:

Motivating Story


Let's say you're a developer and you're looking to build a whiz-bang dynamic content website. Your weapon of choice is Python, so you start looking at the various Python web frameworks. After losing many hours and half your hair examining the 31 flavors (and growing!), you decide to look into one or two in detail. Say you look at Django and TurboGears. Django has about 60% of what you need, and TurboGears has about 75% of what you need. At which point you say "Python web frameworks all suck!", you write your own, and you port your site to it. (Not to name names, or anything... ;-) ) So you end up a) examining the frameworks, b) writing your own new framework, c) actually constructing your site in your new framework, and d) promoting your new framework to others.

Why is this a bad thing?


I've got three reasons that come to the top of my mind. But first, let me give you an alternative story. Rather than writing your own framework, you decided to add the 25% that TurboGears lacked. You either became a TurboGears developer with source tree access and updated the source directly or sent your patches to someone who would integrate them later. You then ported your site to the (possibly customized) TurboGears framework. So why was the first scenario (where you "rolled your own") bad?

1. You wasted your time


I'm not saying that you could have avoided all the work of developing your own framework by extending someone else's. I'm just saying you could have avoided some of it. There's some overhead in reading the other framework's code and grokking the design, as well as in actually modifying the framework by implementing the functionality you need. But I contend that it's a lot less work than you would spend inventing your own framework.

Then there's the problem of bugs. In the 75% of TurboGears that you re-implemented, you wrote bugs. You probably revised your initial design a few times. This is all work you could have avoided if you'd chosen to contribute to an existing project rather than rolling your own.

Then there's the work you did promoting your framework to others. If you'd enhanced TurboGears (or Django, this is just an example), you could ride the free wave of publicity that comes with a well-known project.

2. You wasted other people's time


This really comes from promoting your framework. If you want to keep everything to yourself, you won't waste anyone's time but your own. But when you make it available to others, you generate more work for the next developer trying to decide on a framework. (And because tastes in programming style differ, he's just about as likely to decide that your framework sucks as any other.) Either way, he had to do the research. If you were on his "short list", he had to spend a decent amount of time weighing the benefits of using your framework. And if he decided not to go with you, that's time that is (mostly) wasted.

3. Your work has diminished impact


You did some clever things implementing your framework. You might have used a new abstraction that no one has considered before. Maybe you created a templating framework with features that 70% of developers want, and no other framework provides. You want to contribute back to the community. Well, when you create a new framework, your community is a lot smaller than it could be. Contribute code to an existing framework, make it even better, and all its users benefit from your work. "Roll your own," and you'll end up pulling a few developers from using other frameworks to using yours, but you'll probably still have a smaller impact than if you'd contributed to an existing framework.

The dynamics of markets usually end up promoting only 1-3 really dominant players: "gorillas" and "chimps", to borrow from Geoffrey Moore. All the "monkeys" then end up competing for the scraps left by the others. This is true in "real" marketplaces, and it's true in the marketplace of ideas. Most people have some resistance to change, and a great many don't want to spend any more time than necessary making decisions. Where is all this going? Realistically, you're not too likely to become the gorilla if you roll your own framework. And even if you manage to, you've diminished the impact of the previous gorilla's contributions to the state of the art. In the FOSS world, gorillas can actually be a net positive for everyone, and (I believe) should be encouraged. If OpenBSD, FreeBSD, and NetBSD were stronger competitors to Linux in Free kernels, FOSS kernels would run on a lot fewer machines, there would be less hardware support, and things would in general be a lot worse for everyone involved.

Why do people continue to re-invent the wheel?


So why do we have such a proliferation of web frameworks? Here are some guesses:

  • It's really easy to build a new framework - Titus Brown discusses this a bit. The "pain threshold" in rolling your own is so low that you might as well do it.

  • Writing code is more fun than reading code - This is one that bites me all the time. You convince yourself that you could write what you need faster than you could grok existing code and extend it. Maybe so. But if you're like me, you'd end up grossly underestimating the amount of time needed to test and debug your new design, not to mention the design changes you make as you develop it. (Maybe call that "grokking your own design" ;-) )

  • It's cool to have your own framework - OK, there are personal benefits to rolling your own. You can blog about it. Other people can blog about it. People get to know your name. Self-promotion may sound crass, but it's nice to be recognized. And it helps in getting a job. For the benefit of the state of the art, though, I'd still rather you contributed to an existing codebase.

  • You want something with a fundamentally different architecture than anything that's out there - I think this is perhaps the first really valid reason to roll your own. Zope should not be the only framework to choose from. It works for a lot of people. But it has a baroque architecture compared to pretty much any other Python web framework. I'll go out on a limb and say that most people who think everything else is fundamentally wrong for their needs are just trying to justify a decision that was actually made because of one of the above three reasons. It's better to find a mainstream framework you can live with than invent a "perfect" framework that no one else can.

  • The existing project manager won't accept your help - I guess sometimes this happens. It seems to have happened with XFree86. OK, go ahead and roll your own. If the manager's truly impossible to deal with, you might even fork the project. But I think this, like the reason above, is a lot less likely than many think.

Am I a hypocrite?


In this post, yeah, a little bit. As more of a user of frameworks than an author, I haven't contributed code to anything (yet). I use a "chimp" language (Python) rather than a "gorilla" (Java, or Perl or PHP for dynamic languages). I've rolled a lot of my own code because I was too lazy to grok someone else's. Hopefully I'll get to the ideal in this post eventually. But I am trying to head in (what I think is) the right direction.

16 comments:

  1. Well the other alternative is that you adjust your requirements to suit the most appropriate framework. Are all your requirements REALLY "requirements"? Perhaps it's sometimes more of a case of you're in the mindset to write your own framework, and then you start looking to see what's around?

    Do you think up in your head what you want your car to look like before you go car shopping? And then when you find the car closest to the model in your head, you purchase it and then modify it? Of course it's easier to modify a framework than a vehicle, but sometimes you just gotta make do with what you've got - and usually, once you change your paradigm, the product does indeed meet your true requirements :)

  2. I absolutely agree. It's kind of like with compilers; when I was a beginning programmer, I was always sure that the compiler was buggy. After all, the program kept crashing. Slowly and painfully I learned that the compiler's usually right, and 99% of the time it was me screwing something up.

    I guess framework design can be the same way. Just because you're opinionated doesn't mean you're right. People don't intentionally build bad frameworks, so when you find one that looks bad, you should at least consider the possibility that you just don't understand it yet, and put in the time to really grok the design. It'll take less time than designing a new framework, and you'll be a better programmer for it.

  3. Anonymous, 2:17 PM

    Yeah, let's all not experiment with creating our own frameworks so that we'll always be stuck with the ones we have now.

  4. Anonymous, 2:19 PM

    Also, let's not write about technical stuff like frameworks any more because it may waste other people's time. This whole internet thing should be reconsidered.

  5. the "python web framework problem" has proceeded nicely; while lots were complaining about "too many web frameworks" for the past two or three years, the diversity among frameworks has allowed terrific improvements in techniques and approaches. Now that we have an idea of how to do things, the evidence at Pycon illustrates that the field is now narrowing significantly to embrace the best techniques that have emerged, to four maybe five main choices. these frameworks have all gained significantly from the existence of each other as well as frameworks which preceded them.

    i don't see the "new framework every week" thing going on these days, Python is now past that stage. you now see frameworks starting to merge; Myghty has been deprecated in favor of its successor Pylons; Spyce's own maintainer admits he'd rather use Django, as soon as they fix up their ORM (which will eventually be sqlalchemy); and Turbogears and Pylons are openly talking about a future merge.

    so while it was tough for those overwhelmed by "too many choices", i think it's clear that it's far easier to choose now than it was say a year ago...it was a growing stage and Python is much better off for it.

  6. Re: anonymous

    I heard a great comment at PyCon whose wording I will now butcher: "You should definitely write your own framework if your goal is to learn how to write web frameworks." My point in this post was not so much to discourage others from learning how to write frameworks, but rather to encourage them to contribute to the existing frameworks we have to make them even better. We don't get "stuck with the ones we have now" if we take this approach; we improve the ones we have now.

    This post was also written in a time of frustration as I was considering which framework I should try using to create web apps. TurboGears and Django were getting some press, I had tried Zope a bit (not really my cup of tea), and web.py had just been released. My fear, which may have been baseless, was that the lack of 1-3 dominant players would discourage users and contributors, and prevent any of the existing frameworks from getting really good. (And the trend seemed to be accelerating at the time, although it seems to have settled down a bit to Django, Zope, TurboGears, and Pylons for the most part, with TurboGears and Pylons sharing so much philosophy and code that they're kind of like two views into the same project. So there's my 3 players, kind of.)

    Ah, I just saw zzzeek's comment and couldn't agree more (and this post was written more than a year ago). So I'll just end this remark here. ;-)

  7. Three reasons to write your own:

    1. Code styleguide
    If you have a large codebase of existing code it is far easier in terms of maintenance if all code is written in the same style (indentation, comments, etc) - it is really hard to auto-generate documentation if a sub-system like a framework for a website does not work with your docu generator.

    2. Licence issues
    Sadly much open source and commercial software comes with confusing, unclear and sometimes plain stupid licence models. Getting approval for a proprietary licence or even an established licence like (L)GPL is a complex process in a bigger company. This is a homegrown problem of many companies (and mostly based on fear and not being informed), but the fact persists: sometimes you save 2 days with an external lib and spend a week in conf calls and emails to get approval for the licence.

    3. Future development
    If you write your own code you have the full control over the future development for the codebase. What if the framework/lib you are using does not get maintained anymore (yeah, open source and all - but if the "core developers" quit most mid-sized open source projects are effectively dead)? What about system requirements? Ever been in a situation where one lib is incompatible with the changed system requirements of another? Really nasty. Same about update cycles, security issues (you shouldn't rely on security-by-obscurity, but having your own code nobody knows is a plus), etc...

    Does all this involve releasing the framework to the public? Well, maybe you do - maybe not. But I would always write my own if I intend to maintain a codebase for a couple years. External dependencies suck badly. Admittedly a personal - maybe stupid - opinion. But it has worked pretty well for me over the last 10 years of writing (and maintaining) code.

  8. Someone at Reddit had the correct quote from PyCon that I butchered above, so I thought I'd copy it here:

    James Tauber had a great line at the PyCon web frameworks panel this year: "reinventing the wheel is great if your goal is to learn more about wheels."

    So there. Funnier and much more succinct than my foggy memory of the event.

    and @mikx:

    All of your reasons could apply to any code or library you might reuse, and seem symptomatic of NIH (not invented here) syndrome. There are certainly some times when it may be impossible to reuse code due to business constraints etc., but (IMO) code reuse is usually the way to go.

  9. @rick
    Yes, those rules apply to any kind of code. Might be true our team is a little infected by the NIH syndrome - I am open to admit that ;)

  10. Another reason to write your own framework:

    If your approach is different/unique enough from one of the existing frameworks, you can expect a huge battle from the existing devs on the framework in getting your changes in.

    This has happened numerous times in the past where a benevolent (or not) dictator leader of an open source project will deride, mock, and completely oppose an approach that goes against "the philosophy" of his own project. Sometimes this is for legitimate reasons; other times it's just turf protection pure and simple.

    You can waste months of time learning the other project, coding up your solution, integrating it with their own weird internal infrastructure components, and evangelizing, and have it all be a waste of time.


    I'm currently working on (gasp!) yet another object relational persistence framework in Java. Sounds like the stupidest thing ever, right? It's not:

    1) To most people, persistence framework == database persistence framework. Choose to use a more high-performance type of storage engine (e.g. Berkeley DB)? Your choices are quickly very limited, with some of the most popular, industry-standard choices (e.g., Hibernate) instantly eliminated.

    2) Many ORM frameworks require use of a query language, which adds parsing overhead. Want to use a faster, API-based query? Again, your options are precious few.

    In fact, at this point, I think the only option is Amazon's Carbonado.

    OK, now say you're considering moving to something even more performant in the future, e.g., some sort of distributed data storage system. Well, then, you'd have to write an add-on to, say, Carbonado to let it interact with this new storage mechanism. That means learning their framework and trying to figure out what they did and how to do it - a tall order. Why invest time learning something like that - with no guarantee of your work getting accepted - when you can just write your own framework?

    Plus, frankly, it's not THAT hard, and doesn't take THAT long to write something like this. Certainly much less time than it would take to try to convince, say, the Hibernate community to adopt a radical new approach like this!

    Yes, if an existing framework really meets your needs, then by all means use it. (And I do that often.) But if no one is meeting your need, then meet your own. If you've done it well, you might be surprised to see others adopt your work too.

  11. Anonymous, 1:17 AM

    This is really quite a pessimistic outlook. Anyone could just as easily contribute more to other projects through what was learned by their own exploration.

    I think it's especially important to write your own code in some areas, because understanding what you're using is what leads to that big idea that WILL provide something innovative and new to the Python community, whether or not it's through an existing project.

    After all, what happens when you put 50 chimps up against a gorilla?

    ;)

  12. I see your point about your own exploration. Maybe the title of my post should have been "Three Reasons Why You Shouldn't Publish Your Own Web Framework and Recruit Others to Use/Develop It."

    As a side note, things really have changed in the Python web framework world since I wrote this article. It seems that the "hot" frameworks now are Django, Pylons, and TurboGears (in that order), and Pylons and TurboGears share huge swaths of code, so you can almost consider them two sides of the same coin. The advent of WSGI has blessed Python with interoperability between various libraries and frameworks as well. So I probably wouldn't write this article today; I'd probably write something on how you should create components and libraries that can be reused in a variety of frameworks.
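
As a rough illustration of what such a reusable component might look like, here is a minimal sketch of a WSGI middleware written against the standard WSGI calling convention, using only the standard library. The names `RequestCounter`, `hello_app`, and `call_app`, and the `X-Request-Count` header, are all hypothetical and made up for this example; `hello_app` merely stands in for an application produced by whichever WSGI-compatible framework you happen to use.

```python
from wsgiref.util import setup_testing_defaults


class RequestCounter:
    """Hypothetical WSGI middleware: counts requests and reports the count in a header."""

    def __init__(self, app):
        self.app = app      # any WSGI application, from any framework
        self.count = 0

    def __call__(self, environ, start_response):
        self.count += 1
        count = self.count

        def counting_start_response(status, headers, exc_info=None):
            # Append our header, then delegate to the server's start_response.
            headers = list(headers) + [("X-Request-Count", str(count))]
            return start_response(status, headers, exc_info)

        return self.app(environ, counting_start_response)


def hello_app(environ, start_response):
    """Stand-in for an application built with any WSGI framework."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, world\n"]


def call_app(app, path="/"):
    """Invoke a WSGI app once without a server; return (status, headers, body)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


# The middleware wraps the application without knowing which framework built it.
wrapped = RequestCounter(hello_app)
```

Because the middleware depends only on the WSGI interface, the same class could in principle wrap an application from any framework that exposes a WSGI callable, which is the kind of reuse described above.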

  13. Another reason to roll your own framework would be that you had one *before* the others got started. That's the case for CubicWeb, a semantic web application framework that has been developed since 2000 and has had for years features that will be very difficult to add to Django, TG and the like because of their design: a SPARQL-like query language and multi-database support are obvious examples.

  14. @NicolasChauvat: Of course, if your framework predated TG, Django, Pylons, etc., that's a very good reason to roll your own.

  15. Before advising hackers to contribute to existing projects, you ought to try it yourself! :)

    Contrary to popular belief, it's not simply a matter of sending your patch to the project's mailing list and then kicking up your heels as untold thousands of anonymous users enjoy the fruits of your labor. Patches that introduce new features can generate debate that rages for months, and even the simplest bug-fix patches can generate considerable chatter.

    The great irony of open source software development is that many (I'd wager most) patches are not accepted. Once an open-source project is established, it develops an inertia that resists change. Its maintainers select patches for stability and compatibility over functionality. Even if they agree that your new feature is worthwhile, if it breaks compatibility they'll queue it up for the next major release -- which, for some projects, can be years away.

    Also, you shouldn't discount the competitive nature of open-source software development. Without competition, there's little incentive to innovate. In what other industry would we welcome less competition?

    While I don't agree with the reasons you mentioned in your post, there is one good reason not to write your own web framework: you should be writing something else instead.

    In this post, Bruce Eckel discusses how the Not Invented Here (NIH) syndrome can be an obstacle to getting actual work done.

  16. @claymation: Thanks for the comment! I actually have contributed to a couple of projects (though not always through the mailing list), and it definitely can be a daunting task. It's probably worth it most of the time, but I can definitely see the usefulness of forking -- sometimes ;-) .
