Walking Python objects recursively

2011/12/11 § 6 Comments

Here’s a small function that walks over any* Python object and yields the objects contained within (if any) along with the path to reach them. I wrote it and am using it to validate a deserialized data structure, but you can probably use it for many things. In fact, I’m rather surprised I didn’t find something like this on the web already, and perhaps it should go in itertools.

Edit: Since the original post I have added infinite recursion protection following Eli and Greg’s good advice, added Python 3 compatibility and done some refactoring (which means I had to add proper unit tests). You will always be able to get the latest version here, on ActiveState’s Python Cookbook (at least until it makes its way into stdlib, fingers crossed…).

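try:
    from collections.abc import Mapping, Set, Sequence  # Python 3.3+
except ImportError:
    from collections import Mapping, Set, Sequence  # Python 2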

# dual python 2/3 compatibility, inspired by the "six" library
string_types = (str, unicode) if str is bytes else (str, bytes)
iteritems = lambda mapping: getattr(mapping, 'iteritems', mapping.items)()

def objwalk(obj, path=(), memo=None):
    if memo is None:
        memo = set()
    iterator = None
    if isinstance(obj, Mapping):
        iterator = iteritems
    elif isinstance(obj, (Sequence, Set)) and not isinstance(obj, string_types):
        iterator = enumerate
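    if iterator:
        if id(obj) not in memo:
            memo.add(id(obj))  # mark as in progress so cycles terminate
            for path_component, value in iterator(obj):
                for result in objwalk(value, path + (path_component,), memo):
                    yield result
            memo.remove(id(obj))  # let the same object be visited via other paths
    else:
        yield path, obj  # leaf: not a container we know how to walk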
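
The memo handling is easiest to see on two small inputs: a list that contains itself, and a dict that refers to the same list twice (the case Greg raises in the comments). The sketch below is a Python 3 only variant written for this comparison. The name `walk` and the `forget` flag are hypothetical, not part of the recipe; the flag switches between removing an id from the memo after recursing and keeping it forever.

```python
from collections.abc import Mapping, Sequence, Set

def walk(obj, path=(), memo=None, forget=True):
    """Yield (path, leaf) pairs; `forget` is an illustrative switch only."""
    if memo is None:
        memo = set()
    if isinstance(obj, Mapping):
        children = obj.items()
    elif isinstance(obj, (Sequence, Set)) and not isinstance(obj, (str, bytes)):
        children = enumerate(obj)
    else:
        yield path, obj  # leaf
        return
    if id(obj) in memo:
        return  # already being walked (or already seen, when forget=False)
    memo.add(id(obj))
    for component, value in children:
        yield from walk(value, path + (component,), memo, forget)
    if forget:
        memo.remove(id(obj))  # allow the same object under another path

a = [1, 2]
shared = {0: a, 1: a}   # same list reachable by two paths, no cycle
cyclic = [1]
cyclic.append(cyclic)   # a list that contains itself

print(list(walk(shared)))                # both paths to `a` are reported
print(list(walk(shared, forget=False)))  # the second path to `a` is skipped
print(list(walk(cyclic)))                # terminates despite the cycle
```

With `forget=True` the memo only tracks containers that are currently being walked, so it guards against cycles without hiding repeated references.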

And here’s a little bit of sample usage:

>>> tuple(objwalk(True))
(((), True),)
>>> tuple(objwalk({}))
()
>>> tuple(objwalk([1,2,3]))
(((0,), 1), ((1,), 2), ((2,), 3))
>>> tuple(objwalk({"http": {"port": 80, "interface": ""}}))
((('http', 'interface'), ''), (('http', 'port'), 80))
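
As an illustration of the validation use case mentioned at the top, here is a minimal sketch. It restates objwalk in Python 3 only form so the block runs on its own. `find_invalid_leaves` and `ALLOWED_LEAF_TYPES` are hypothetical names invented for this example, and the rule they enforce (only JSON-style scalars as leaves) is an assumption, not something the recipe prescribes.

```python
from collections.abc import Mapping, Sequence, Set

def objwalk(obj, path=(), memo=None):
    """Python 3 restatement of the recipe: yield (path, leaf) pairs."""
    if memo is None:
        memo = set()
    if isinstance(obj, Mapping):
        children = obj.items()
    elif isinstance(obj, (Sequence, Set)) and not isinstance(obj, (str, bytes)):
        children = enumerate(obj)
    else:
        yield path, obj  # leaf
        return
    if id(obj) not in memo:
        memo.add(id(obj))
        for component, value in children:
            yield from objwalk(value, path + (component,), memo)
        memo.remove(id(obj))

# Assumed rule for this example: leaves must be JSON-style scalars.
ALLOWED_LEAF_TYPES = (str, int, float, bool, type(None))

def find_invalid_leaves(data):
    """Return the paths of leaves whose type is not in the allowed set."""
    return [path for path, leaf in objwalk(data)
            if not isinstance(leaf, ALLOWED_LEAF_TYPES)]

config = {"http": {"port": 80, "interface": ""}, "started": object()}
print(find_invalid_leaves(config))  # [('started',)]
```

Because each leaf comes with its path, a validation error can point at the exact location of the offending value.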

* "any" is a strong word and Python is a flexible language; I wrote this function to work with container objects that respect the ABCs in the collections module, which mostly cover the usual builtin types and their subclasses. If there’s something significant I missed, I’d be happy to hear about it.

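To make the footnote concrete, the short check below shows which objects the isinstance tests would treat as containers. It relies only on `collections.abc` and standard library types. `is_walkable`, `Registry` and `Point` are hypothetical names made up for this illustration.

```python
from collections import OrderedDict, deque
from collections.abc import Mapping, Sequence, Set

class Registry(Mapping):
    """Hypothetical read-only mapping built on the Mapping ABC."""
    def __init__(self, data):
        self._data = dict(data)
    def __getitem__(self, key):
        return self._data[key]
    def __iter__(self):
        return iter(self._data)
    def __len__(self):
        return len(self._data)

class Point:
    """Hypothetical plain object: iterable, but not a registered container."""
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __iter__(self):
        return iter((self.x, self.y))

def is_walkable(obj):
    """Mirror the container test used by objwalk (Python 3 spelling)."""
    if isinstance(obj, Mapping):
        return True
    return isinstance(obj, (Sequence, Set)) and not isinstance(obj, (str, bytes))

samples = ({}, OrderedDict(), Registry({"a": 1}), [], (), deque(), set(),
           frozenset(), range(3), "text", b"raw", Point(1, 2), iter([]))
for sample in samples:
    print(type(sample).__name__, is_walkable(sample))
```

Anything that fails the test, such as `Point` or a bare iterator, is yielded as a leaf rather than walked.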

§ 6 Responses to Walking Python objects recursively

  • Eli says:

    How about adding an “infinite recursion” protection mechanism for cyclic data-structures? Shouldn’t be too hard, by keeping a dict of object IDs.

  • Greg says:

    recursion protection might be a bit over-zealous. Shouldn’t you remove from the memo after recursing, else this non-recursive structure isn’t fully described:

    b={0:a, 1:a}
    tuple(objwalk(b)) ?

    • Yaniv Aknin says:

      Hmmm, well, I definitely didn’t think about it, although you could say that perhaps it’s not desired to visit a twice in such a case. Regardless, I think I’d have been least-astonished with the behaviour you’re describing, so I’ll change the code. Thanks!

  • Ionel M says:

    It’s best to have interface checks (e.g. hasattr(obj, '__iter__')) instead of isinstance checks (for portability).
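
One wrinkle with a pure `hasattr(obj, '__iter__')` test is that Python 3 strings have `__iter__` too, and iterating a one-character string yields that same string, so a naive duck-typed walker recurses until it hits the recursion limit unless strings are excluded. Below is a rough sketch of the interface-check approach. `duckwalk` is a hypothetical name, and treating anything with a callable `items` attribute as a mapping is an assumption made for the example.

```python
def duckwalk(obj, path=(), memo=None):
    """Duck-typed variant: walk anything iterable except str and bytes."""
    if memo is None:
        memo = set()
    if isinstance(obj, (str, bytes)) or not hasattr(obj, '__iter__'):
        yield path, obj  # leaf
        return
    if id(obj) in memo:
        return  # cycle guard
    memo.add(id(obj))
    # Assumption: a callable `items` attribute marks a mapping-like object.
    if callable(getattr(obj, 'items', None)):
        children = obj.items()
    else:
        children = enumerate(obj)
    for component, value in children:
        yield from duckwalk(value, path + (component,), memo)
    memo.remove(id(obj))

def numbers():
    """Small generator used as a non-ABC iterable."""
    yield 10
    yield 20

print(list(duckwalk({"gen": numbers(), "name": "ab"})))
```

The trade-off is that this version also walks into generators and other one-shot iterators, consuming them in the process, whereas the ABC-based checks leave them alone as leaves.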
