Generic state machine class using iterators






2006 Guenter Milde. Released under the terms of the GNU General Public License (v. 2 or later)

"""Simple generic state machine class using iterators


Example: A two-state machine sorting numbers in the categories
         "<= 3" and "> 3".


Import the basic class::

>>> from simplestates import SimpleStates

Subclass and add state handlers:

>>> class StateExample(SimpleStates):
...    def high_handler_generator(self):
...        result = []
...        for token in self.data_iterator:
...            if token <= 3:
...                self.state = "low"
...                yield result
...                result = []
...            else:
...                result.append(token)
...        yield result
...    def low_handler_generator(self):
...        result = []
...        for token in self.data_iterator:
...            if token > 3:
...                self.state = "high"
...                yield result
...                result = []
...            else:
...                result.append(token)
...        yield result

Set up an instance of the StateExample machine with some test data::

>>> testdata = [1, 2, 3, 4, 5, 4, 3, 2, 1]
>>> testmachine = StateExample(testdata, state="low")

>>> print [name for name in dir(testmachine) if name.endswith("generator")]
['high_handler_generator', 'low_handler_generator']


Iterating over the state machine yields the results of state processing::

>>> for result in testmachine:
...     print result,
[1, 2, 3] [5, 4] [2, 1]

For a correctly working sorting algorithm, we would expect::

  [1, 2, 3] [4, 5, 4] [3, 2, 1]

However, to achieve this a backtracking algorithm is needed. See
and for an example.

The `__call__` method returns a list of results. It is used if you call
an instance of the class::

>>> testmachine()
[[1, 2, 3], [5, 4], [2, 1]]

"""
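The token that triggers a state change is consumed by the handler that detects it, which is why 4 and 3 are missing from the output above. One possible way to backtrack is to hand that token back to the data source so that the next state sees it again. The following is only a minimal sketch of that idea, written for Python 3; `PushBackIterator`, `push_back` and `split_low_high` are hypothetical names invented for this illustration and are not part of the module.

```python
class PushBackIterator:
    """Iterator wrapper with a push_back() method (hypothetical helper)."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._pushed = []  # tokens handed back, served before new input

    def __iter__(self):
        return self

    def __next__(self):
        if self._pushed:
            return self._pushed.pop()
        return next(self._iterator)

    def push_back(self, token):
        """Make `token` the next value returned by the iterator."""
        self._pushed.append(token)


def split_low_high(data, threshold=3, state="low"):
    """Group consecutive tokens into "low" (<= threshold) and "high" runs."""
    tokens = PushBackIterator(data)
    groups = []
    result = []
    for token in tokens:
        token_state = "low" if token <= threshold else "high"
        if token_state != state:
            tokens.push_back(token)  # backtrack: the new state re-reads it
            state = token_state
            groups.append(result)
            result = []
        else:
            result.append(token)
    groups.append(result)
    return groups


print(split_low_high([1, 2, 3, 4, 5, 4, 3, 2, 1]))
```

With the test data from the docstring, this groups the numbers as `[1, 2, 3]`, `[4, 5, 4]`, `[3, 2, 1]`, because each boundary token is read twice: once by the state that rejects it and once by the state that keeps it.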


Abstract State Machine Class

class SimpleStates:
    """generic state machine acting on iterable data

    Class attributes:

      state -- name of the current state (next state_handler method called)
      state_handler_generator_suffix -- common suffix of generator functions
                                        returning a state-handler iterator
    """
    state = 'start'
    state_handler_generator_suffix = "_handler_generator"


The constructor does two things:

  • It sets the data object to the `data` argument.

  • It stores the remaining keyword arguments as instance attributes (or callable attributes, if they are function objects), overriding the class defaults (a neat little trick I found somewhere on the net).

    .. note:: This is the same as `self.__dict__.update(keyw)`. However, the “Tutorial” advises confining the direct use of `__dict__` to post-mortem analysis or the like…

    def __init__(self, data, **keyw):
        """data   --  iterable data object
                      (list, file, generator, string, ...)
           **keyw --  all remaining keyword arguments are
                      stored as instance attributes
        """
        self.data = data  # keep the raw iterable; __iter__ wraps it in iter()
        for (key, value) in keyw.iteritems():
            setattr(self, key, value)
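The effect of the keyword arguments can be seen in isolation. This is a small sketch for Python 3 (so `items()` stands in for `iteritems()`); the class name `Defaults` and the `verbose` keyword are made up for the illustration.

```python
class Defaults:
    state = "start"  # class-level default, shared by all instances

    def __init__(self, data, **keyw):
        self.data = data
        for key, value in keyw.items():
            setattr(self, key, value)  # stored on the instance only


machine = Defaults([1, 2, 3], state="low", verbose=True)
other = Defaults([])

print(machine.state)   # instance attribute shadows the class default
print(other.state)     # falls back to the class default
print(vars(machine))   # the instance __dict__ now holds data, state, verbose
```

The class default itself is left untouched, so other instances still start in `'start'` unless they pass their own `state`.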

Iteration over class instances

The special `__iter__` method returns an iterator. This allows a class instance to be used directly in an iteration loop. We define `__iter__` as a generator method that sets the initial state and then iterates over the data, calling the state methods:

    def __iter__(self):
        """Generate and return an iterator

        * ensure `data` is an iterator
        * convert the state generators into iterators
        * (re) set the state attribute to the initial state
        * pass control to the active state's state_handler,
          which should consume and process `self.data_iterator`
        """
        self.data_iterator = iter(self.data)
        # turn the <state>_handler_generator methods into state handlers
        self._initialize_state_generators()
        # now start the iteration
        while True:
            yield getattr(self, self.state)()
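The `while True` loop above ends when the active handler raises `StopIteration`, which under Python 2 simply terminates the surrounding generator. On Python 3.7 and later (PEP 479), a `StopIteration` escaping a generator body is turned into a `RuntimeError`, so a port would probably need an explicit guard. The sketch below contrasts the two loop styles; `legacy_loop`, `guarded_loop` and the canned handler are assumptions made for the illustration.

```python
def legacy_loop(handler):
    """Python 2 style loop: relies on StopIteration escaping the generator."""
    while True:
        yield handler()


def guarded_loop(handler):
    """Python 3 style loop: ends cleanly when the handler is exhausted."""
    while True:
        try:
            result = handler()
        except StopIteration:
            return
        yield result


# Stand-in handler: the bound __next__ of an iterator over canned results.
results = list(guarded_loop(iter([[1, 2, 3], [5, 4]]).__next__))
print(results)

try:
    list(legacy_loop(iter([[1, 2, 3]]).__next__))
    legacy_error = None
except RuntimeError as error:  # raised on Python 3.7 and later
    legacy_error = error
print(type(legacy_error).__name__)
```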

A helper method generates the state handlers from the generators. It is called by the `__iter__` method above:

    def _initialize_state_generators(self):
        """Generic function to initialise state handlers from generators

        functions whose name matches `[^_]<state>_handler_generator` will
        be converted to iterators and their `.next()` method stored as
        the attribute `<state>`.
        """
        suffix = self.state_handler_generator_suffix
        shg_names = [name for name in dir(self)
                      if name.endswith(suffix)
                      and not name.startswith("_")]
        for name in shg_names:
            shg = getattr(self, name)
            setattr(self, name[:-len(suffix)], shg().next)
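The name-based conversion can be tried on its own. The following sketch assumes Python 3, where the iterator method is spelled `__next__` instead of `next`; the class `Demo` and its two generator methods are hypothetical.

```python
class Demo:
    suffix = "_handler_generator"

    def low_handler_generator(self):
        yield "first call"
        yield "second call"

    def _private_handler_generator(self):  # skipped: leading underscore
        yield "never registered"


demo = Demo()
names = [name for name in dir(demo)
         if name.endswith(Demo.suffix) and not name.startswith("_")]
for name in names:
    generator = getattr(demo, name)()  # start the generator
    # strip the suffix: "low_handler_generator" -> "low"
    setattr(demo, name[:-len(Demo.suffix)], generator.__next__)

print(names)
print(demo.low())  # each call resumes the generator up to its next yield
print(demo.low())
```

Because the handler is the bound `__next__` of one running generator, local variables inside the generator survive between calls, which is what lets a state handler keep its partial `result` across state changes.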

Use instances like functions

To allow use of class instances as callable objects, we add a __call__ method:

    def __call__(self):
        """Iterate over state-machine and return results as a list"""
        return [token for token in self]
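For readers on Python 3, the methods described so far can be assembled roughly as follows. This is a sketch of a possible port, not the module itself: the class name `SimpleStatesPy3` is hypothetical, and `items()`, `__next__` and the `StopIteration` guard are adaptations for Python 3.

```python
class SimpleStatesPy3:
    """Python 3 sketch of the generic state machine (hypothetical port)."""
    state = "start"
    state_handler_generator_suffix = "_handler_generator"

    def __init__(self, data, **keyw):
        self.data = data
        for key, value in keyw.items():
            setattr(self, key, value)

    def __iter__(self):
        self.data_iterator = iter(self.data)
        self._initialize_state_generators()
        while True:
            try:
                result = getattr(self, self.state)()
            except StopIteration:  # active handler is exhausted
                return
            yield result

    def _initialize_state_generators(self):
        suffix = self.state_handler_generator_suffix
        for name in dir(self):
            if name.endswith(suffix) and not name.startswith("_"):
                generator = getattr(self, name)()
                setattr(self, name[:-len(suffix)], generator.__next__)

    def __call__(self):
        return list(self)


class StateExample(SimpleStatesPy3):
    def high_handler_generator(self):
        result = []
        for token in self.data_iterator:
            if token <= 3:
                self.state = "low"
                yield result
                result = []
            else:
                result.append(token)
        yield result

    def low_handler_generator(self):
        result = []
        for token in self.data_iterator:
            if token > 3:
                self.state = "high"
                yield result
                result = []
            else:
                result.append(token)
        yield result


testmachine = StateExample([1, 2, 3, 4, 5, 4, 3, 2, 1], state="low")
print(list(testmachine))
print(testmachine())
```

Both calls give the same grouping as the doctest in the module docstring, including the dropped boundary tokens 4 and 3.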

Command line usage

Running this script runs the doctests in the module docstring:

if __name__ == "__main__":
    import doctest
    doctest.testmod()
