Nested json filtering using jq or cel

99 Views Asked by At

Need to return payload.* json path where condition payload...cities.name = city_x matches. Tried looking google cel or jq for this case. did not find way for this so far.

Case 1 - Sample Json Input

{
    "payload": {
        "book1": {
            "isbn": "123-456-2221",
            "author": {
                "lastname": "Doe",
                "firstname": "Jane"
            },
            "editor": {
                "lastname": "Smith",
                "firstname": "Jane"
            },
            "title": "The Ultimate Database Study Guide",
            "category": [
                "Non-Fiction",
                "Technology"
            ],
            "sellerid1": {
                "name": "sellerid1",
                "cities": [{
                        "name": "city_a",
                        "sellcount": 100
                    },
                    {
                        "name": "city_y",
                        "sellcount": 200
                    }
                ]
            },
            "sellerid2": {
                "name": "sellerid2",
                "cities": [{
                        "name": "city_a",
                        "sellcount": 10
                    },
                    {
                        "name": "city_b",
                        "sellcount": 15
                    }
                ]
            }
        },
        "book2": {
            "isbn": "123-456-222",
            "author": {
                "lastname": "Doe",
                "firstname": "Jane"
            },
            "editor": {
                "lastname": "Smith",
                "firstname": "Jane"
            },
            "title": "C programming guide",
            "category": [
                "Programming",
                "Technology"
            ],
            "sellerid3": {
                "name": "sellerid3",
                "cities": [{
                        "name": "city_a",
                        "sellcount": 100
                    },
                    {
                        "name": "city_b",
                        "sellcount": 200
                    }
                ]
            },
            "sellerid4": {
                "name": "sellerid4",
                "cities": [{
                        "name": "city_a",
                        "sellcount": 10
                    },
                    {
                        "name": "city_x",
                        "sellcount": 15
                    }
                ]
            }
        }
    }
}

######## Sample json case 2 ###############

{
    "payload": {
        "book1": {
            "isbn": "123-456-2221",
            "author": {
                "lastname": "Doe",
                "firstname": "Jane"
            },
            "editor": {
                "lastname": "Smith",
                "firstname": "Jane"
            },
            "title": "The Ultimate Database Study Guide",
            "category": [
                "Non-Fiction",
                "Technology"
            ],
            "sellers": {
                "sellerid1": {
                    "name": "sellerid1",
                    "cities": [{
                            "name": "city_a",
                            "sellcount": 100
                        },
                        {
                            "name": "city_y",
                            "sellcount": 200
                        }
                    ]
                },
                "sellerid2": {
                    "name": "sellerid2",
                    "cities": [{
                            "name": "city_a",
                            "sellcount": 10
                        },
                        {
                            "name": "city_b",
                            "sellcount": 15
                        }
                    ]
                }
            }
        },
        "book2": {
            "isbn": "123-456-222",
            "author": {
                "lastname": "Doe",
                "firstname": "Jane"
            },
            "editor": {
                "lastname": "Smith",
                "firstname": "Jane"
            },
            "title": "C programming guide",
            "category": [
                "Programming",
                "Technology"
            ],
            "sellers": {
                "sellerid3": {
                    "name": "sellerid3",
                    "cities": [{
                            "name": "city_a",
                            "sellcount": 100
                        },
                        {
                            "name": "city_b",
                            "sellcount": 200
                        }
                    ]
                },
                "sellerid4": {
                    "name": "sellerid4",
                    "cities": [{
                            "name": "city_a",
                            "sellcount": 10
                        },
                        {
                            "name": "city_x",
                            "sellcount": 15
                        }
                    ]
                }
            }
        },
        "book3": {
            "isbn": "123-456-2221",
            "author": {
                "lastname": "Doe",
                "firstname": "Jane"
            },
            "editor": {
                "lastname": "Smith",
                "firstname": "Jane"
            },
            "title": "The Ultimate Database Study Guide",
            "category": [
                "Non-Fiction",
                "Technology"
            ]
        },
        "book4": {
            "isbn": "123-456-222",
            "author": {
                "lastname": "Doe",
                "firstname": "Jane"
            },
            "editor": {
                "lastname": "Smith",
                "firstname": "Jane"
            },
            "title": "C programming guide",
            "category": [
                "Programming",
                "Technology"
            ],
            "sellers": {}
        }
    }
}

required output is same for case1 and case2.

{
    "payload": {
        "book2": {
            "isbn": "123-456-222",
            "author": {
                "lastname": "Doe",
                "firstname": "Jane"
            },
            "editor": {
                "lastname": "Smith",
                "firstname": "Jane"
            },
            "title": "C programming guide",
            "category": [
                "Programming",
                "Technology"
            ],
            "sellerid3": {
                "name": "sellerid3",
                "cities": [{
                        "name": "city_a",
                        "sellcount": 100
                    },
                    {
                        "name": "city_b",
                        "sellcount": 200
                    }
                ]
            },
            "sellerid4": {
                "name": "sellerid4",
                "cities": [{
                        "name": "city_a",
                        "sellcount": 10
                    },
                    {
                        "name": "city_x",
                        "sellcount": 15
                    }
                ]
            }
        }
    }
}

Tried CEL to filter , CEL playground. Nested filtering this style did not find option in CEL. Looking if any other way using jq.

2

There are 2 best solutions below

2
On

This keeps all those fields under .payload that have at least one path .[].cities[].name that is resolvable and evaluates to "city_x". Consequently, other fields are removed.

.payload |= with_entries(select(IN(.value[].cities?[]?.name; "city_x")))
{
  "payload": {
    "book2": {
      "isbn": "123-456-222",
      "author": {
        "lastname": "Doe",
        "firstname": "Jane"
      },
      "editor": {
        "lastname": "Smith",
        "firstname": "Jane"
      },
      "title": "C programming guide",
      "category": [
        "Programming",
        "Technology"
      ],
      "sellerid3": {
        "name": "sellerid3",
        "cities": [
          {
            "name": "city_a",
            "sellcount": 100
          },
          {
            "name": "city_b",
            "sellcount": 200
          }
        ]
      },
      "sellerid4": {
        "name": "sellerid4",
        "cities": [
          {
            "name": "city_a",
            "sellcount": 10
          },
          {
            "name": "city_x",
            "sellcount": 15
          }
        ]
      }
    }
  }
}

Demo

0
On

Solution

Here's an approach that will keep only those books that have cities[].name somewhere in their descendants that equals city_x (demo):

.payload[] |= select( [ .. | if (type == "object" and has("cities")) 
                             then (.cities[] | .name == "city_x") 
                             else false 
                             end ] | any )
                           

A little more modular approach (demo):

def criteria($city):

    [ .. | if (type == "object" and has("cities")) 
           then (.cities[] | .name == $city) 
           else false 
           end ] | any;
         
.payload[] |= select( criteria("city_x") )

Explanation

The idea is that for each book produced by .payload[], we only want to let it pass through if it satisfies the criteria: select( criteria("city_x") )

The criteria function produces all possible descendants (..) but we are only interested in objects that have a cities property. If such an object is encountered, we unwarp the .cities property to iterate over each city and check if it's name matches city_x. Of course, we are only interested in retaining a book, if there is at least one match.

And so for each descendant, the logic produces false if the criteria is not satisfied, and true if it is. The final any checks that one of these elements is true, which means that the matched book should be included in the final result.

A caveat: if the same book has two cities named "city_x", this would produce two true elements in the array for the same input object but this is not a problem because we have the final any. If it can be assumed that this wouldn't occur, the logic can be made a little simpler by moving the any to the then clause (demo).