I have a lot of JSON files which appear to be exports of collections, each exported as a single document containing an array of the documents in that collection.
I am having a problem importing them into MongoDB: the way I am importing them treats each file as one document, which exceeds the 16 MB document limit (some of the files are 140 MB).
The structure is:
{
    "CollectionName": [
        {
            ...
        },
        ...
        {
            ...
        }
    ]
}
The sub-documents in the array have a unique id in an attribute called "id", which I'm assuming was the original document id before the export.
I am using PowerShell to execute mongoimport and import the collections. The code I currently have is:
# Strip the dated folder prefix from the file path to get the collection name
$collection = [regex]::Replace($_.FullName, "(?:C\:\\macship-inbound\\\d{4}\.\d{2}\.\d{2}\.\d{2}\.\d{2})", "")

# Build the mongoimport argument list and splat it onto the executable
$params = '--db', 'machship',
    '--collection', "$collection",
    '--type', 'json',
    '--file', $_.FullName,
    '--batchSize', '100',
    '--numInsertionWorkers', '500'

& "C:\MongoDBTools\bin\mongoimport.exe" @params
I have tried adding --jsonArray to the parameters but that doesn't work.
I would like to import the JSON using "CollectionName" as the collection name in the database, with each sub-document in the array becoming its own document in that collection.
Is this possible? I'm happy to use a different approach or technology; I only used PowerShell because it is easy to add to Task Scheduler on the heavily locked-down machine I am using.
Here is a very well massaged ChatGPT solution (using Node.js) that I eventually arrived at after MANY iterations of dealing with large imports hanging, slow import speed, and scripts hanging indefinitely for no apparent reason.
Hopefully this helps someone.
I run it with a PowerShell script which gets all the child items in the folder and pipes them to the script individually.
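In outline, the sketch below shows the shape of that approach rather than the exact script: parse the export, take the single top-level key as the collection name, and insert the array in fixed-size batches with the official MongoDB Node.js driver. The connection string, batch size, and the assumption that the file path arrives as a command-line argument are placeholders to adjust for your own setup.

// importCollection.js - illustrative sketch, not the exact script described above.
// Usage: node importCollection.js <path-to-export.json>
const fs = require('fs');
const { MongoClient } = require('mongodb');

const URI = 'mongodb://localhost:27017'; // placeholder connection string
const DB_NAME = 'machship';              // database name from the question
const BATCH_SIZE = 1000;                 // assumed batch size; tune as needed

async function main() {
    const file = process.argv[2];
    if (!file) {
        console.error('Usage: node importCollection.js <file.json>');
        process.exit(1);
    }

    // Parse the whole export in one go; workable for files around 140 MB.
    // Much larger files would need a streaming parser instead.
    const wrapper = JSON.parse(fs.readFileSync(file, 'utf8'));

    // The single top-level key is the collection name; its value is the document array.
    const collectionName = Object.keys(wrapper)[0];
    const docs = wrapper[collectionName];
    if (!Array.isArray(docs)) {
        throw new Error(`Top-level key "${collectionName}" is not an array in ${file}`);
    }

    const client = new MongoClient(URI);
    await client.connect();
    try {
        const collection = client.db(DB_NAME).collection(collectionName);

        // Insert in fixed-size batches so no single insert gets too large.
        for (let i = 0; i < docs.length; i += BATCH_SIZE) {
            const batch = docs.slice(i, i + BATCH_SIZE);
            await collection.insertMany(batch, { ordered: false });
            console.log(`${collectionName}: ${Math.min(i + BATCH_SIZE, docs.length)} / ${docs.length}`);
        }
    } finally {
        // Close the connection, otherwise the Node process can hang instead of exiting.
        await client.close();
    }
}

main().catch(err => {
    console.error(err);
    process.exit(1);
});

An open client connection is one common reason a Node process never exits, hence the finally block around the close.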