How do I import a JSON file into a JSONiq collection?

I have looked everywhere, and even the JSONiq documentation says "this is beyond the scope of this document." I have a JSON file (an array of JSON objects) I want to import into JSONiq (particularly Zorba, which by the way is a terrible name because it makes Internet searches for information futile) to use as a collection to query. Is there a tutorial, or spec, or anything anywhere that tells me how to do this?

There are 2 best solutions below

Zorba supports adding documents to a collection. The framework for doing so is documented here. Note, however, that Zorba's store lives in memory and does not persist anything beyond the scope of one query, so this is of limited use without a persistence layer.

If the use case is simply to query a JSON file stored on your local drive, then it may be simpler to use EXPath's file module as well as parse-json, like so:

jsoniq version "1.0";

import module namespace file = "http://expath.org/ns/file";

let $my-object := parse-json(file:read-text("/path/to/document.json"))
return $my-object.foo

The above query returns "bar" if /path/to/document.json contains

{ "foo" : "bar" } 

parse-json also accepts additional options for parsing documents that contain multiple objects (JSON Lines, etc.).
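
Since the question is about a file holding a single array of objects, here is a minimal sketch of that case. It assumes a hypothetical file /path/to/array.json whose content is one top-level array such as [ { "foo" : "bar" }, { "foo" : "foobar" } ], and it relies on JSONiq's array unboxing syntax ([]) to turn the parsed array into a sequence of objects. It has not been checked against a specific Zorba release; if the [] syntax is rejected, the members() function is meant to do the same job.

```jsoniq
jsoniq version "1.0";

import module namespace file = "http://expath.org/ns/file";

(: Hypothetical path; the file is assumed to contain one top-level array of objects :)
let $array := parse-json(file:read-text("/path/to/array.json"))
(: $array[] unboxes the array into the sequence of its members :)
for $object in $array[]
return $object.foo
```

With the sample content above, this would be expected to yield one "foo" value per object in the array.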

For advanced users, this is how to use collections to avoid reading the file(s) every time:

jsoniq version "1.0";

import module namespace file = "http://expath.org/ns/file";
import module namespace ddl = "http://zorba.io/modules/store/dynamic/collections/ddl";
import module namespace dml = "http://zorba.io/modules/store/dynamic/collections/dml";

(: Populating the collection :)
variable $my-collection := QName("my-collection");
ddl:create($my-collection, parse-json(file:read-text("/tmp/doc.json")));

(: And now the query :)

for $object in dml:collection($my-collection)
group by $value := $object.foo
return {
  "value" : $value,
  "count" : count($object)
}

This is /tmp/doc.json:

{ "foo" : "bar" }
{ "foo" : "bar" }
{ "foo" : "foo" }
{ "foo" : "foobar" }
{ "foo" : "foobar" }

And the query above returns (the order of the groups is not guaranteed without an order by clause):

{ "value" : "bar", "count" : 2 }
{ "value" : "foobar", "count" : 2 }
{ "value" : "foo", "count" : 1 }

For the sake of completeness: in Rumble, a distributed JSONiq implementation on Spark, JSON files are read with json-doc() (when a single JSON value is spread over multiple lines) or json-lines() (when there is one JSON value per line, on possibly billions of lines).