advanced JSON query language

I've explored a couple of existing JSON query languages such as JMESPath, JsonPath and JSONiq. Unfortunately, none of them seems to support my use case in a generic way.

Basically, I'm receiving different types of responses from different web services. I need to let the user remap each response into a 2-dimensional array in order to leverage our visualization tool. Based on the new format, the user can decide how to display their data across existing widgets. Pretty much like a customisable dashboard managed entirely in the UI.

Anyway my input looks like:

{
  "category_1": [
    {
      "name": "medium",
      "count": 10
    },
    {
      "name": "high",
      "count": 20
    }
  ],
  "category_2": [
    {
      "name": "medium",
      "count": 30
    },
    {
      "name": "high",
      "count": 40
    }
  ]
}

expected output:

[
  {
    "name": "medium",
    "count": 10,
    "category": "1"
  },
  {
    "name": "high",
    "count": 20,
    "category": "1"
  },
  {
    "name": "medium",
    "count": 30,
    "category": "2"
  },
  {
    "name": "high",
    "count": 40,
    "category": "2"
  }
]

The closest I got was with JMESPath, but my query isn't dynamic at all. The user needs to know every possible category in advance.

The query looks like: [ category_1[].{name: name, count: count, category: '1'}, category_2[].{name: name, count: count, category: '2'} ] | []

In other words, I need a JSON query language powerful enough to express this JavaScript code:

// Assumes a lodash-style flatMap(collection, iteratee) that passes (value, key).
const output = flatMap(input, (value, key) => {
  return value.map(x => {
    return { ...x, category: key };
  });
});

Any thoughts?

There are 5 best solutions below

2
On BEST ANSWER

This is indeed not currently possible in JMESPath (0.15.x). There are other spec-compliant JMESPath packages that (with a bit of extra effort) will do what you require. Using the NPM package @metrichor/jmespath (a TypeScript implementation), you could extend it with the function you need as follows:


import {
  registerFunction,
  search,
  TYPE_ARRAY,
  TYPE_OBJECT
} from '@metrichor/jmespath';

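// Register a custom JMESPath function named flatMapValues.
registerFunction(
  'flatMapValues',
  ([inputObject]) => {
    // Turn { key: [v1, v2] } into [[key, v1], [key, v2]].
    return Object.entries(inputObject).reduce((flattened, entry) => {
      const [key, value]: [string, any] = entry;

      if (Array.isArray(value)) {
        // One [key, element] pair per array element.
        return [...flattened, ...value.map(v => [key, v])];
      }
      // Non-array values are kept as a single [key, value] pair.
      return [...flattened, [key, value]];
    }, [] as any[]);
  },
  // Accepted argument types: a single object or array.
  [{ types: [TYPE_OBJECT, TYPE_ARRAY] }],
);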

With this function registered, a JMESPath expression that copies the key into every value looks like this:

search("flatMapValues(@)[*].merge([1], {category: [0]})", {
  "category_1": [
    {
      "name": "medium",
      "count": 10
    },
    {
      "name": "high",
      "count": 20
    }
  ],
  "category_2": [
    {
      "name": "medium",
      "count": 30
    },
    {
      "name": "high",
      "count": 40
    }
  ]
});

// OUTPUTS:

[
  {
    category: 'category_1',
    count: 10,
    name: 'medium',
  },
  {
    category: 'category_1',
    count: 20,
    name: 'high',
  },
  {
    category: 'category_2',
    count: 30,
    name: 'medium',
  },
  {
    category: 'category_2',
    count: 40,
    name: 'high',
  },
]

That said, you could just register the function you wrote above and use it.
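
To see what the expression does step by step without installing anything, here is a minimal plain-JavaScript sketch of the same two stages. It does not use any JMESPath library. The name `flatMapValues` mirrors the function registered above; `remap` is a hypothetical helper that stands in for the `[*].merge([1], {category: [0]})` part of the expression.

```javascript
// Stage 1: turn { key: [v1, v2] } into [[key, v1], [key, v2]].
// Non-array values are kept as a single [key, value] pair.
function flatMapValues(inputObject) {
  return Object.entries(inputObject).flatMap(([key, value]) =>
    Array.isArray(value) ? value.map((v) => [key, v]) : [[key, value]]
  );
}

// Stage 2 (hypothetical helper): merge each value with a `category`
// field holding its key, mimicking merge([1], {category: [0]}).
function remap(inputObject) {
  return flatMapValues(inputObject).map(([key, value]) => ({
    ...value,
    category: key,
  }));
}

const input = {
  category_1: [{ name: "medium", count: 10 }, { name: "high", count: 20 }],
  category_2: [{ name: "medium", count: 30 }, { name: "high", count: 40 }],
};

console.log(JSON.stringify(flatMapValues(input)[0]));
// ["category_1",{"name":"medium","count":10}]
console.log(JSON.stringify(remap(input)[0]));
// {"name":"medium","count":10,"category":"category_1"}
```

The intermediate `[key, value]` pairs are what make the key reachable inside the projection, which is the part plain JMESPath cannot do on its own.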

1
On

I finally found a way with JSONiq, using the Zorba implementation. Definitely the way to go if you need powerful JSON queries. Apparently JSONiq has also been integrated with Apache Spark through Rumble.

Anyway, here's my solution:

jsoniq version "1.0";

let $categories := 
{
  "category_1": [
    {
      "name": "medium",
      "count": 10
    },
    {
      "name": "high",
      "count": 20
    }
  ],
  "category_2": [
    {
      "name": "medium",
      "count": 30
    },
    {
      "name": "high",
      "count": 40
    }
  ]
}

for $key in keys($categories), $row in flatten($categories.$key)
    return {"count": $row.count, "name": $row.name, "category": $key}

output:

{ "count" : 10, "name" : "medium", "category" : "category_1" }{ "count" : 20, "name" : "high", "category" : "category_1" }{ "count" : 30, "name" : "medium", "category" : "category_2" }{ "count" : 40, "name" : "high", "category" : "category_2" }

You can try Zorba here.

3
On

This is an alternative possibility in JSONiq that does not explicitly list the keys in each row, with the merge constructor {| |}:

jsoniq version "1.0";

let $categories := 
{
  "category_1": [
    {
      "name": "medium",
      "count": 10
    },
    {
      "name": "high",
      "count": 20
    }
  ],
  "category_2": [
    {
      "name": "medium",
      "count": 30
    },
    {
      "name": "high",
      "count": 40
    }
  ]
}
for $key in keys($categories),
    $row in members($categories.$key)
return {|
  $row,
  { "category": $key }
|}

For the sake of completeness, this is the reverse query that would turn the output back into the original input (which uses a group by clause):

jsoniq version "1.0";
let $output :=
(
  { "count" : 10, "name" : "medium", "category" : "category_1" },
  { "count" : 20, "name" : "high", "category" : "category_1" },
  { "count" : 30, "name" : "medium", "category" : "category_2" },
  { "count" : 40, "name" : "high", "category" : "category_2" }
)
return
{|
  for $row in $output
  group by $category := $row.category
  return { $category : [ $row ] }
|}
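
For readers more at home in JavaScript, the group-by step can be sketched roughly as follows. This is only an illustration, not a translation of the JSONiq engine's behaviour: `groupByCategory` is a hypothetical name, and the sketch additionally strips the `category` field from each row so that the result matches the original input exactly (as far as I can tell, the query above leaves that field on the rows).

```javascript
// Group flat rows back into { category: [rows...] }.
// The category field is removed from each row (an extra step in this sketch).
function groupByCategory(rows) {
  const groups = new Map(); // Map preserves first-seen category order
  for (const { category, ...rest } of rows) {
    if (!groups.has(category)) groups.set(category, []);
    groups.get(category).push(rest);
  }
  return Object.fromEntries(groups);
}

const rows = [
  { count: 10, name: "medium", category: "category_1" },
  { count: 20, name: "high", category: "category_1" },
  { count: 30, name: "medium", category: "category_2" },
  { count: 40, name: "high", category: "category_2" },
];

console.log(JSON.stringify(groupByCategory(rows).category_2));
// [{"count":30,"name":"medium"},{"count":40,"name":"high"}]
```
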
1
On

You actually don't need any additional libs for that. Here is a small function which does the trick. You only need to split the key.

const transform = (obj) => {
    const ret = [];
    for (let key in obj) {
        const tmp = key.split('_');
        for (let item of obj[key]) {
            ret.push({
                ...item,
                [tmp[0]]: tmp[1],
            });
        }
    }
    return ret;
};

// obj is the input object from the question.
const result = transform(obj);
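
The function above assumes every key contains exactly one underscore. Here is a hedged variant, as a sketch only, that splits at the last underscore instead, so a key like "risk_level_2" would give a "risk_level" field with value "2". The helper `splitKey` and the fallback to a "category" field for keys without an underscore are assumptions of this sketch, not part of the original answer.

```javascript
// Split a key at its LAST underscore: "category_1" -> ["category", "1"].
// Keys without an underscore fall back to ["category", key] (assumption).
function splitKey(key) {
  const i = key.lastIndexOf("_");
  return i === -1 ? ["category", key] : [key.slice(0, i), key.slice(i + 1)];
}

// Flatten { key: [items] } into one array, tagging each item with the split key.
function transform(obj) {
  return Object.entries(obj).flatMap(([key, items]) => {
    const [field, value] = splitKey(key);
    return items.map((item) => ({ ...item, [field]: value }));
  });
}

const obj = {
  category_1: [{ name: "medium", count: 10 }, { name: "high", count: 20 }],
  category_2: [{ name: "medium", count: 30 }, { name: "high", count: 40 }],
};

console.log(JSON.stringify(transform(obj)[0]));
// {"name":"medium","count":10,"category":"1"}
```

Unlike the query-language answers, this yields "1" and "2" rather than "category_1" and "category_2", which matches the expected output in the question.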
2
On

This is simple with ~Q (disclaimer: I'm the developer).

{
   "results:{}:[]": [{
       "{}:":".",
       "category":"$key"
   }]
}

Output:

{
    "results": [
        {
            "name": "medium",
            "count": 10,
            "category": "category_1"
        },
        {
            "name": "high",
            "count": 20,
            "category": "category_1"
        },
        {
            "name": "medium",
            "count": 30,
            "category": "category_2"
        },
        {
            "name": "high",
            "count": 40,
            "category": "category_2"
        }
    ]
}

Edit: some more info to explain the syntax:

"results:{}:[]"

The :{} part means "iterate over all keys in the object", :[] means "iterate over all array elements".

"{}:":"."

This copies each field in the current object to the output.

"category":"$key"

This adds a field called "category", with the currently traversed key as its value.

If we want just the numbers (i.e. 1, 2, ... instead of category_1, category_2, etc.), we can use substr:

"category": "$key substr(9)"