Is there a Linux/macOS command to combine multiple multi-line JSON files into a single NDJSON file?
item1.json
{
"type": "Feature",
"version": "1.0.0",
"id": "item1"
}
item2.json
{
"type": "Feature",
"version": "1.0.0",
"id": "item2"
}
Wanted result: items.json
{"type": "Feature", "version": "1.0.0", "id": "item1"}
{"type": "Feature", "version": "1.0.0", "id": "item2"}
In a valid JSON document, whitespace characters used outside of strings are insignificant, and control characters (\u0000 to \u001f) used inside of strings must be escaped.
Newline characters (\n, \u000a) fall into both categories (as do carriage return characters (\r, \u000d), if that matters). So inside of strings they cannot exist in their plain form, and outside of them they are insignificant. Thus, you can safely remove all occurrences with any capable tool, including JSON-agnostic ones, to bring a JSON file down to a single line.
As for creating an NDJSON file out of many multi-line JSON files, a straightforward approach could be to have a for loop successively provide all the JSON files, tr to delete the line breaks from each, followed by a simple echo to generate the delimiter line breaks in the target file.
An easier approach producing the same result could be to use paste -s (serial mode), which simply concatenates all lines of an input file into one using a delimiter. Set the delimiter to the empty string using -d '' to override the default TAB.
If you also wanted to compact the JSON documents by removing regular space characters, be aware that they only fall into the whitespace category, not the control characters category, so only the ones outside of strings (like those used for indentation) can be deleted without logically altering the document. Hence, you'd be better off using a proper JSON parser, as it can reliably determine whether a given character in the representation is part of an encoded string or not. One such JSON-parsing CLI tool is jq, which comes with a dedicated --compact-output (or -c) flag.
Demo for jq on jqplay.org
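The three approaches described above could look roughly like the sketch below. It is only an illustration under a few assumptions of mine: the scratch directory, the item*.json glob, and the output file names (items_tr.ndjson, items_paste.ndjson, items_jq.ndjson) are made up for the demo. The paste call uses -d '\0', which is the POSIX spelling of an empty delimiter; as far as I know some paste implementations (BSD/macOS) reject a literal -d '', so treat that substitution as a portability guess rather than part of the answer. The jq step only runs if jq is installed.

```shell
#!/bin/sh
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two sample files from the question.
printf '{\n"type": "Feature",\n"version": "1.0.0",\n"id": "item1"\n}\n' > item1.json
printf '{\n"type": "Feature",\n"version": "1.0.0",\n"id": "item2"\n}\n' > item2.json

# Approach 1: tr deletes every line break in a file,
# then echo writes the single newline that delimits NDJSON records.
for f in item*.json; do
  tr -d '\n' < "$f"
  echo
done > items_tr.ndjson

# Approach 2: paste -s joins all lines of each input file into one line,
# emitting one output line per file. '\0' means "empty delimiter" here.
paste -s -d '\0' item*.json > items_paste.ndjson

# Approach 3 (optional): jq parses the JSON and re-serializes it compactly,
# so spaces outside of strings are dropped as well.
if command -v jq >/dev/null 2>&1; then
  jq -c . item*.json > items_jq.ndjson
fi

cat items_tr.ndjson
```

The tr and paste variants only strip line breaks, so any spaces between tokens survive; only the jq variant also removes those. Note that the glob expands in lexical order, which determines the order of the records in the output.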