Friday, August 22, 2008

Lazy Descriptors

Today I had a need to create a property on an object "lazily." The Python builtin property does a great job of this, but it calls the getter function every time you access the property. Here is how I ended up solving the problem:

First of all, I had (almost) the behavior I wanted by using the following pattern:


class Foo(object):
    def __init__(self):
        self._bar = None

    @property
    def bar(self):
        # Compute the value on first access, then reuse the stored copy
        if self._bar is None:
            print('Calculating self._bar')
            self._bar = 42
        return self._bar

There are a couple of problems with this, however. First of all, I'm polluting my object's namespace with a _bar attribute that I don't want. Secondly, I'm using this pattern all over my codebase, and it's quite an eyesore.

Both problems can be fixed by using a descriptor. Basically, a descriptor is an object with a __get__ method, which Python calls when the descriptor is looked up as an attribute on a class or on one of its instances. The descriptor I created is below:

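class LazyProperty(object):

    def __init__(self, func):
        self._func = func
        self.__name__ = func.__name__
        self.__doc__ = func.__doc__

    def __get__(self, obj, klass=None):
        # Accessed on the class rather than an instance: return the descriptor itself
        if obj is None:
            return self
        # Compute once and store the value in the instance dict under the same name
        result = obj.__dict__[self.__name__] = self._func(obj)
        return result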
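To see when Python actually calls __get__, here is a bare-bones sketch that is separate from LazyProperty. The names Verbose and Holder are made up for illustration; the descriptor just hands back the arguments it received so the two call shapes are visible.

```python
class Verbose(object):
    """Hypothetical descriptor that reports how __get__ was invoked."""

    def __get__(self, obj, klass=None):
        # obj is the instance (None when accessed on the class itself);
        # klass is the class the attribute was looked up on
        return (obj, klass)


class Holder(object):
    attr = Verbose()


h = Holder()
instance_access = h.attr     # Python calls Verbose.__get__(descriptor, h, Holder)
class_access = Holder.attr   # Python calls Verbose.__get__(descriptor, None, Holder)
```

LazyProperty above relies on the same two call shapes: the instance case does the real work, and the class case is handled by the obj is None check.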

The descriptor is designed to be used as a decorator, and will save the decorated function and its name. When the descriptor is accessed, it will calculate the value by calling the function and save the calculated value back to the object's dict. Saving back to the object's dict has the additional benefit of preventing the descriptor from being called the next time the property is accessed: LazyProperty defines only __get__, and for such descriptors an entry in the object's dict takes precedence during attribute lookup. So I can now use it in the class above:

class Foo(object):
    @LazyProperty
    def bar(self):
        print('Calculating self._bar')
        return 42
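A quick way to check the behavior is to count how often the decorated function runs. The sketch below repeats a condensed copy of the descriptor so it runs on its own; returning the descriptor itself on class access and the calls counter are assumptions added for this illustration, not part of the class above.

```python
class LazyProperty(object):
    def __init__(self, func):
        self._func = func
        self.__name__ = func.__name__
        self.__doc__ = func.__doc__

    def __get__(self, obj, klass=None):
        if obj is None:
            return self  # assumption: class access yields the descriptor
        result = obj.__dict__[self.__name__] = self._func(obj)
        return result


class Foo(object):
    calls = 0  # hypothetical counter, only here to observe the behavior

    @LazyProperty
    def bar(self):
        Foo.calls += 1
        return 42


foo = Foo()
cached_before = 'bar' in foo.__dict__   # nothing stored yet
first = foo.bar                         # runs the function, stores the result
second = foo.bar                        # served from foo.__dict__, no call
cached_after = 'bar' in foo.__dict__
calls_after_two_reads = Foo.calls

# Deleting the instance attribute drops the stored value, so the next
# access goes through the descriptor again and recalculates.
del foo.bar
third = foo.bar
calls_after_reset = Foo.calls
```

The value lives under the plain name bar in the instance dict, so there is no extra _bar attribute, and each instance keeps its own copy.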

So I get a nice lazily calculated property that doesn't recalculate bar every time it's accessed and doesn't bother with any memoization itself. What do you think about it? Is this a pattern you use in your code?


8 comments:

Anonymous said...

I'm too lazy to try to understand how this thing works, what it does, and ramifications... but your naming it "descriptor", uh, woot?

http://foldoc.org/foldoc.cgi?query=descriptor

And isn't a function called _every_ time anyway?

Sri said...

Mate,

Great post. I wish you had posted this a month earlier :D. I had stumbled upon this in django (in the django.contrib.auth.middleware.py). It does just the thing to obtain access to the user object on a request.

But keep em coming mate, keep em coming.

Cheers
Sri

Doug Hellmann said...

That looks like a good way to defer object retrieval from a database.

Rick Copeland said...

@anonymous:
Actually, I didn't invent the name "descriptor" -- it's Python terminology documented here. I probably should have given more context in the post. As for a function being called every time, yes, that still happens, but the (potentially expensive) calculation of the value of bar only happens once.

Rick Copeland said...

@doug:
Yup, you guessed it. In particular, I wanted an ORM-mapped object as an attribute on a TurboGears controller.

RonnyPfannschmidt said...

you don't need to check if the attribute slot is on the instance

non-data descriptors get overridden by instance attributes

also please copy the name to __name__ and the documentation to __doc__

so it's easier to use with the common introspection tools

Rick Copeland said...

@ronny:

Thanks for the pointers. I have updated the post to reflect your changes.

@anonymous:
With the changes suggested by ronny, a function is no longer called on each attribute access (technically, it wasn't being called before, I just had an extraneous check of the object's __dict__).

Adrienne said...

I'm really excited to use this. Maybe sometime this week in Algebra or Geometry class. ;)