How to use gradle mlExportToFile and REST Transform to Create NDJSON File

I am currently using gradle mlExportToFile to export JSON documents to a file. However, I need the final file to be in newline-delimited JSON (NDJSON) format, with each document on its own line within the file. I saw that the mlExportToFile + REST transform method can be used to create CSVs, so I am assuming this approach will also work for NDJSON. I've tried just about everything I can think of, and the export still does not put each record on its own line. Any advice is greatly appreciated!

There is 1 best solution below

I think the limitation here is that the underlying ExportToWriterListener (from the ML Java Client / DMSDK library) writes each document to the file in the format in which it is retrieved from ML, which includes newlines by default. The ml-gradle task is just a single line of code - runQueryBatcherJob(new ExportToFileJob()) - so you could easily write your own customized task.

I believe you can reuse ExportToFileJob (which is in the marklogic-data-movement-components project) - it has a getExportListener() method that returns the ExportToWriterListener (in the marklogic-client-api project). That listener in turn has an onGenerateOutput(OutputListener) method, so you can write an OutputListener that collapses each JSON document into a single line.
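As a rough illustration of the "single line" part only, here is a minimal sketch in plain Java with no MarkLogic dependencies. The class name JsonLineCompactor and the method toSingleLine are hypothetical names made up for this example; they are not part of ml-gradle or the Java Client. It assumes the input is valid JSON, in which case dropping whitespace outside of string literals is enough, because raw newlines cannot appear inside JSON strings.

```java
// Hypothetical helper (not part of any MarkLogic library): collapses a
// pretty-printed JSON document into one line by dropping whitespace that
// sits outside of string literals. Assumes the input is valid JSON.
final class JsonLineCompactor {

    private JsonLineCompactor() {
    }

    static String toSingleLine(String json) {
        StringBuilder out = new StringBuilder(json.length());
        boolean inString = false; // currently inside a "..." literal
        boolean escaped = false;  // previous char was a backslash inside a literal

        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                // Everything inside a string literal is copied verbatim.
                out.append(c);
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
                out.append(c);
            } else if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                // Outside literals, keep everything except JSON whitespace.
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

An OutputListener passed to onGenerateOutput could then read each record's content as a string, return JsonLineCompactor.toSingleLine(...) of it, and a newline would still need to be written between records (I believe ExportToWriterListener has a record suffix setting for that, but check the version you are on). If a JSON library such as Jackson is already on the classpath, re-serializing the parsed document without pretty-printing is probably the more robust choice.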

That could certainly become a new feature in ml-gradle too, for example via a property such as -PnewlineDelimitedJson=true.

Feel free to raise a ticket in the ml-gradle project for this (though the enhancement will actually be made in the marklogic-data-movement-components project).