How to produce a flat list of all values from a given JSON

143 Views Asked by At

I would like to produce a list of all values from any given JSON. The special requirement is that for values in nested attributes, all the values of parent's attributes must be repeated in the line. No keys must be printed.

For example, from this input JSON:

{
   "results":[
      {
         "id":306,
         "name":"First Company",
         "branches":[
            {
               "id":4191,
               "city":"Seattle",
               "customers":[
                  {
                     "id":446,
                     "name":"Big Tech 1"
                  },
                  {
                     "id":447,
                     "name":"Big Tech 2"
                  }
               ]
            },
            {
               "id":4192,
               "city":"Oakland",
               "customers":[
                  {
                     "id":448,
                     "name":"Health Tech 1"
                  },
                  {
                     "id":449,
                     "name":"Health Tech 2"
                  }
               ]
            }
         ]
      }
   ]
}

I would like to produce this output (please notice the repeated values are highlighted in red, but the output should be with no colors; no keys should be printed):

enter image description here

The JSON above is just an example and generic JSON with arbitrary depth of nesting of JSON objects must be assumed, which suggests that the processing must be recursive. If null appears they must be printed as well.

3

There are 3 best solutions below

2
pmf On

Iterate and nest:

jq -r '
  .results[] | [.id, .name] + (
    .branches[] | [.id, .city] + (
      .customers[] | [.id, .name]
    )
  ) | @tsv
'
306 First Company   4191    Seattle 446 Big Tech 1
306 First Company   4191    Seattle 447 Big Tech 2
306 First Company   4192    Oakland 448 Health Tech 1
306 First Company   4192    Oakland 449 Health Tech 2

Demo

To keep the values JSON-encoded (quotes around strings), you could re-convert them using @json

jq -r '
  .results[] | [.id, .name] + (
    .branches[] | [.id, .city] + (
      .customers[] | [.id, .name]
    )
  ) | map(@json) | @tsv
'
306 "First Company" 4191    "Seattle"   446 "Big Tech 1"
306 "First Company" 4191    "Seattle"   447 "Big Tech 2"
306 "First Company" 4192    "Oakland"   448 "Health Tech 1"
306 "First Company" 4192    "Oakland"   449 "Health Tech 2"

Demo

Alternatively (to the JSON encoding), you could use the @csv builtin, which also escapes strings but also separates the items with a comma:

jq -r '
  .results[] | [.id, .name] + (
    .branches[] | [.id, .city] + (
      .customers[] | [.id, .name]
    )
  ) | @csv
'
306,"First Company",4191,"Seattle",446,"Big Tech 1"
306,"First Company",4191,"Seattle",447,"Big Tech 2"
306,"First Company",4192,"Oakland",448,"Health Tech 1"
306,"First Company",4192,"Oakland",449,"Health Tech 2"

Demo

3
peak On

The following makes various assumptions but only hardcodes "results", "branches" and "customers", and does not assume that the ordering of the keys within the customer objects is uniform. Using scalar_keys in the obvious way, the solution can also be readily extended to allow for greater depth, and indeed could be generalized to compute the array-valued keys.

def scalar_keys:
  [ keys_unsorted[] as $k
    | .[$k]
    | scalars
    | $k ];
    
.results[]
| [.[scalar_keys[]] ] as $top
| .branches[]
| [.[scalar_keys[]] ] as $intermediate
| .customers
| (first | keys_unsorted) as $keys
| .[]
| $top + $intermediate + [.[$keys[]]]

This produces arrays, so you will want to tack on a call to @tsv or @csv or similar.

0
sudocracy On

Here's a simple solution that does not depend on the structure of the JSON (demo, another demo):

def unroll:
    if type == "object" 
    then [.[] | scalars] + ((.[] | iterables | unroll) // [] ) 
    elif type == "array" 
    then (.[] | unroll)
    else [.] 
    end;
    
unroll | @tsv

It recursively descends through the object graph, but when it encounters an object, it extracts all the scalars so that on the way up (in the recursion), the scalars in the path become one array. When an array is encountered though, it just acts as a pass-through and unrolls the array.