Find the parent of a JSON object using jsonpath-ng


I have a JSON object like the one below. I need to find every 'Statement' and its respective parent. I was able to find the 'Statement's, but I do not know how to extract their parents. I am using Python 3 and jsonpath_ng. Here is my code so far:

from jsonpath_ng import jsonpath, parse
from pprint import pprint

x = {'MasterRole': {'Type': 'AWS::IAM::Role',
          'Properties': {'Path': '/',
           'AssumeRolePolicyDocument': {'Version': '2012-10-17',
            'Statement': [{'Action': ['sts:AssumeRole'],
              'Principal': {'Service': ['ec2.amazonaws.com']},
              'Effect': 'Allow'}]},
           'Policies': [{'PolicyDocument': {'Version': '2012-10-17',
              'Statement': [{'Action': ['s3:AbortMultipartUpload',
                 's3:DeleteObject',
                 's3:PutObjectAcl'],
                'Resource': [{'Fn::Join': ['',
                   ['arn:aws:s3:::', {'Ref': 'ExhibitorS3Bucket'}, '/*']]},
                 {'Fn::Join': ['', ['arn:aws:s3:::', {'Ref': 'ExhibitorS3Bucket'}]]}],
                'Effect': 'Allow'},
               {'Action': ['cloudformation:*'],
                'Resource': [{'Ref': 'AWS::StackId'},
                 {'Fn::Join': ['', [{'Ref': 'AWS::StackId'}, '/*']]}],
                'Effect': 'Allow'}]}}]}}}

In [40]: res = [match.value for match in parse('$..Statement').find(x)]

In [41]: pprint(res)
[[{'Action': ['sts:AssumeRole'],
   'Effect': 'Allow',
   'Principal': {'Service': ['ec2.amazonaws.com']}}],
 [{'Action': ['s3:AbortMultipartUpload', 's3:DeleteObject', 's3:PutObjectAcl'],
   'Effect': 'Allow',
   'Resource': [{'Fn::Join': ['',
                              ['arn:aws:s3:::',
                               {'Ref': 'ExhibitorS3Bucket'},
                               '/*']]},
                {'Fn::Join': ['',
                              ['arn:aws:s3:::',
                               {'Ref': 'ExhibitorS3Bucket'}]]}]},
  {'Action': ['cloudformation:*'],
   'Effect': 'Allow',
   'Resource': [{'Ref': 'AWS::StackId'},
                {'Fn::Join': ['', [{'Ref': 'AWS::StackId'}, '/*']]}]}]]

In [42]:

Expected parents: 'AssumeRolePolicyDocument' and 'PolicyDocument'. I would appreciate help with fetching the parents. I do not understand how to extract information from the context data. Here is how I get the 'DatumInContext' objects (after which I am lost).

In [18]: res = parse('$..Statement').find(x)
In [19]: pprint(res)
[DatumInContext(value=[{'Action': ['sts:AssumeRole'], 'Principal': {'Service': ['ec2.amazonaws.com']}, 'Effect': 'Allow'}], path=Fields('Statement'), context=DatumInContext(value={'Version': '2012-10-17', 'Statement': [{'Action': ['sts:AssumeRole'], 'Principal': {'Service': ['ec2.amazonaws.com']} ......

There are exactly 2 matches (as expected), one for each 'Statement', and each carries its own context. These contexts hold rich information, including full paths, but I do not know how to get the parent name from them. I do not know how to use the various functions such as jsonPath.parent() and jsonPath.parent.find(), and there is little to no documentation on them.

Any help?


There are 2 best solutions below


You can use the `parent` keyword directly in the jsonpath expression:

res = parse('$..Statement.`parent`').find(x)

Hence

pprint({str(m.path): m.value["Statement"] for m in res})

gives

{'AssumeRolePolicyDocument': [{'Action': ['sts:AssumeRole'],
                               'Effect': 'Allow',
                               'Principal': {'Service': ['ec2.amazonaws.com']}}],
 'PolicyDocument': [{'Action': ['s3:AbortMultipartUpload',
                                's3:DeleteObject',
                                's3:PutObjectAcl'],
                     'Effect': 'Allow',
                     'Resource': [{'Fn::Join': ['',
                                                ['arn:aws:s3:::',
                                                 {'Ref': 'ExhibitorS3Bucket'},
                                                 '/*']]},
                                  {'Fn::Join': ['',
                                                ['arn:aws:s3:::',
                                                 {'Ref': 'ExhibitorS3Bucket'}]]}]},
                    {'Action': ['cloudformation:*'],
                     'Effect': 'Allow',
                     'Resource': [{'Ref': 'AWS::StackId'},
                                  {'Fn::Join': ['',
                                                [{'Ref': 'AWS::StackId'},
                                                 '/*']]}]}]}

This might be a hacky solution, but it worked for me. I was stuck exactly where you are. Try this:

res = [str(match.context.path) for match in parse('$..Statement').find(x)]

Let me try to explain how I arrived at the solution. Suppose you print the match objects themselves, like you did:

In [18]: res = parse('$..Statement').find(x)
In [19]: pprint(res)

You'll see that each match has multiple attributes (think of it as JSON for simplicity), and the value attribute is what you get from match.value. There are other attributes as well, and context is one of them. If you look closely at the contents of context, you will find that it contains the full JSON node at that level (the level where Statement lives).

Inside context you will see another attribute, path. match.context.path is the path of the parent node, and wrapping it in str() gives the parent's key name. The parent object itself is match.context.value.

You are right that the documentation is minimal, so I had to hack my way around it.