Python provides a simple way to implement behaviour to builtin methods for custom classes. You just implement some of the various magic methods and you're good to go.
Here is an example on how to leverage these functionality to build a custom sequence.
import os class DirectoryIndex(object): def __init__(self, path): self.items = os.listdir(path)
Of course, in a real application you wouldn't wrap a simple
os.listdir in it's own class and you wouldn't do the feching in the
__init__ method. Your business object may do a lot of more work like fetching data from external ressources, etc.
You can use an object of the class by explicit using the
di = DirectoryIndex('/etc') print "%s: %s" % (len(di.items), di.items)
Since every caller needs to know that we store our result in the
items field (which isn't pythonic and clearly violates the DRY principle, we have no control over who use it and how do they use it. Thus, we cannot optimize the access to the underlying ressource which might be costly to access. Furthermore, the caller can (and someone will) modifiy the
items field which makes the value of the list unreliable even for the containing
Here is a better version:
class DirectoryIndex(object): def __init__(self, path): self._path = path self._items = os.listdir(path) def __len__(self): return len(self._items) def __iter__(self): return iter(self._items) def __repr__(self): return "<DirectoryIndex of %s>" % self._path di = DirectoryIndex('/etc') print di print len(di) <DirectoryIndex of /etc> 195
In this version, we implemented two methods that makes the class actually useful:
__iter__. The former returns the length of the sequence, the latter provides an iterator which is used in iterating constructs like
for ... in. Both methods can contain arbitary code required to return a meaningful value, except that
__iter__ must return an iterator (so we cannot just return
self.items but have to wrap it in an iter). For example, in implementation of a custom sequence backed by a database may issue a
SELECT COUNT(*) FROM ... WHERE ... sql statement to return a value for
But we can improve this class even further. By providing an implementation for
__iter__, we get many built-in features for free:
di = DirectoryIndex('/etc') print 'sudoers' in di True print 'foobar' in di False di = DirectoryIndex('/tmp/emptydir') if di: print True else: print False False
However, the default implementation for the other methods of sequences (like the
__bool__ method or
__contains__ are using the
__iter__ method to create the return value, which causes a evaluation of the whole iterator. For example, a
'sudoers' in di causes a looping through
__iter__ to tell whether the value is in the list or not. But we can implement a few more methods to make that better:
class DirectoryIndex(object): # ... other methods ... def __contains__(self, item): return os.path.exists(os.path.join(self._path, item)) def __getitem__(self, key): return self._items[key.start:key.stop:key.step]
__contains__ method uses a shortcut to test if a file exists (and avoids evaluating the whole iterator). The
__getitem__ method is used by the slicing operator (sth. like
key parameter is an instance of
slice which contains properties for
step. In our example, we simply pass these values to our in-memory list but again you are free to use these values to return only a subset of the items from your backend (e.g. via
SELECT * FROM ... LIMIT x OFFSET y).
Python provides a simple yet powerful way to simply turn your classes into sequences which can be used wherever an
iterable object needs to be passed around. There are a few other methods (see emulating container types in the python documentation) which can be implemented to make some operations more efficient.