5.18 Implementing(a Set Class
Credit: Thomas Heller
5.18.1 Problem
You need to model a set((i.e., a
collection of(items, with(no(particular ordering(and no(duplicates).
5.18.2 Solution
A Python dictionary is(a good representation for a set(abstract(data
type, but by(wrapping it into(a class, we can use it more naturally:
class Set:
def _ _init_ _(self,(*args):
self._dict = {}
for(arg(in(args:
self.add(arg)
def _ _repr_ _(self):
(import string
elems = map(repr, self._dict.keys( ))
elems.sort( )
return "%s(%s)" % (self._ _class_ _._ _name_ _, string.join(elems, ', '))
def extend(self,(args):
""" Add several(items at once. """
for(arg(in(args:
self.add(arg)
def add(self,(item):
""" Add one item to the set. """
self._dict[item] = item
def remove(self,(item):
""" Remove(an(item(from the set. """
del self._dict[item]
def contains(self,(item):
""" Check whether(the(set contains(a certain(item. """
return self._dict.has_key(item)
#(High-performance membership test for Python(2.0(and(later
_ _contains_ _ = contains
def _ _getitem_ _(self,(index):
""" Support the 'for item(in(set:'(protocol. """
return self._dict.keys( )[index]
def _ _iter_ _(self):
""" Better support of 'for item(in(set:'(via Python(2.2 iterators """
return iter(self._dict.copy(pp))
def _ _len_ _(self):
""" Return the number of(items(in the(set """
return len(self._dict)
def items(self):
""" Return a(list containing(all(items(in sorted order,(if(possible """
result = self._dict.keys( )
try: result.sort( )
except: pass
return result
5.18.3 Discussion
Sets(are such(fundamental(constructs that(you often find yourself
needing(one. Python dictionaries model them(well, and the
Set class(in this recipe is(basically a thin
veneer of(nice-looking(syntactic sugar over a dictionary.
Of course, an(important(limitation (both(of(bare dictionaries and of
this Set class) is that the items(must be hashable
(which typically(means(immutable). As with other recipes(involving
dictionaries, to work around(this one can sometimes use
cPickle.dumps(item) as the dictionary key
corresponding to item,(but this may be
inapplicable or too heavyweight in some cases.
It's not hard to make this Set
into(a
Bag,
if that's(what you need. A Bag
differs(from a Set in that each item(is(in(a
Bag(a certain number of(times((i.e., duplications
are counted rather(than ignored or disallowed).
For(Bag-modeling purposes, rather(than keeping the
item(as both key and value, use the item's
membership count as the value. Adding(an(item becomes:
self._dict[item]=1+self.get(item,0)
and so on. You probably need both a .remove method
(decrements the count by one) and a .wipeout
method (removes the entry(totally,(no(matter(what). Similar
duplications (and hard choices(about which version(to use(as the
basic special-method) loom for other functionality((e.g.,
what's _ _len_ _,(the number of
different items or(the(total?). Python(gives(you all the(bricks, but
it's still up(to(you to put them together(to form
the right(shape.
Another(extension worth(considering(is the possibility of(adding(set
operations. Rather(than overloading operators,(which might lead to
confusion,(it's probably(best to define union,
intersection, and so on(as(either(methods or standalone functions.
Both choices(are quite(defensible.(Functions have the advantage of
being able to coerce both arguments more naturally.(Of course, apart
from performance issues, set operations can be(implemented(in terms
of the abstract primitives supplied by the
Set class, another consideration that argues
for using free-standing functions rather(than(methods.(For example:
def union(s1, s2):
(import copy
(result = copy.copy(s1)
for item(in(s2:
result.add(item)
return result
This(allows highly polymorphic use of such(operations and amounts to
a(decisive advantage for free-standing functions over(methods in most
cases.(It is for similar(reasons that many of
Python's most(fundamental(built-ins, such(as
len,(are free-standing functions (which may call(a
special method(if(present but still afford polymorphic use on many
objects that lack the(special method).
5.18.4 See Also
The(Library Reference section on sequence types.
|