How to make glom work with dict-like objects

Glom (https://glom.readthedocs.io/en/latest/) is, amongst other things, for

path-based access for nested structures

But how do you make it work for nested structures beyond dicts?

Consider the following class (purposely not an actual collections.abc.Mapping, for simplicity):

class MyMap: 
    def __init__(self, d):
        self.d = d
    def __getitem__(self, k):
        """just delegating"""
        v = self.d[k]
        if isinstance(v, (dict, MyMap)):
            return MyMap(v)
        else:
            return v

This works:

m = MyMap({'a': {'b': {'c': 'd'}}})
assert m['a']['b']['c'] == 'd'

But this doesn't:

from glom import glom
assert glom(m, 'a.b.c') == 'd'

I get the error: PathAccessError: could not access 'a', part 0 of Path('a', 'b', 'c'), got error: AttributeError("'MyMap' object has no attribute 'a'")

More specifically, how does one specify:

  • what's a node (i.e. an object that can be glommed further)
  • a key iterator (how to split a path into keys)
  • an item getter (how data is retrieved from a key)

In case it helps, here's the kind of function I'm looking for glom to satisfy:

dot_str_key_iterator = lambda p: p.split('.')
bracket_getter = lambda obj, k: obj[k]

def simple_glom(target, spec, 
                node_types=(dict,), 
                key_iterator=dot_str_key_iterator,
                item_getter=bracket_getter
               ):
    for k in key_iterator(spec):
        target = item_getter(target, k)
        if not isinstance(target, node_types):
            break
    return target

This function doesn't have all the bells and whistles, but it allows me to do:

m = MyMap({'a': {'b': {'c': 'd'}}})
simple_glom(m, 'a.b.c', node_types=(MyMap,))

Or, for an extreme example using all the parametrizations:

from types import FunctionType
from functools import partial

attr_glom = partial(simple_glom, 
                    node_types=(FunctionType, type), 
                    key_iterator=lambda p: p.split('/'), 
                    item_getter=getattr)
assert attr_glom(MyMap, '__getitem__/__doc__') == 'just delegating'

Best answer

These are excellent questions! I'll do my best to break them out one at a time:

How to specify that an object can be glommed?

When you say "glom" here, I assume you mean "access".

Python features a very rich data model, and while there's usually an intuitive meaning for "access" in every context, it can be expensive and risky to guess. glom's approach to this is to provide explicit registration APIs.

How to specify a key iterator?

If you want to split a path (e.g., the 'a.b.c' of glom(target, 'a.b.c')) into the structured keys (and operations) used to access that path, I would recommend looking at the Path type, specifically, Path.from_text() and path.items().

If you'd like to look at an object and ascertain which paths exist within the object, that's trickier.

At the time of writing, glom is still primarily designed for accessing and building known structures, with some affordances for defaults/branching. For full context-free iteration of all possible paths, the best I have to offer at the moment is remap (cookbook). Note that remap doesn't have glom's plugin/registration power, but it does work out-of-the-box for iterating glommable paths in common Python datatypes, as this example shows (the corresponding glom() call is done just below).

How to specify how data is retrieved from a key?

I think this is actually addressed in the registration APIs I referenced above. The get keyword argument refers to a built-in operation, which must be specified for all types.

On the off chance your definition of "retrieve" conflicts with the built-in get registration, you can override it, or you can define a new operation that types can be registered against using register_op (e.g., how the assign operation was added).

Again, great questions, you're really leveraging Python's power here. Hope this helps!