Powershell ForEach-Object column variables

708 Views Asked by At

Kind of have a strange problem. I have a large JSON file that needs to be processed. Based on another question I need to stream the file since it will otherwise gets me problems because of memory: JSON Powershell memory issue

What I have is this:

get-content -Path largefile.json | ForEach-Object {
$row = $_ = $_.TrimStart('[').TrimEnd(']')
if ($_) { $_ | Out-String | ConvertFrom-Json }
New-Item -Path $($Row.Id).txt
Set-Content -Path $($Row.Id).txt -Value ($row.Body)
}

I can easily do $row to publish the last processed row in the Largefile.json. I want to create a file with the name of the Id in the row that is currently processed and add the body column to the file. But when I want to show a specific column using $row.Id, unfortunately this shows up empty.

The structure of the Largefile.json is as followed:

[{"Id":"1","ParentId":"parent","Name":"filename","OwnerId":"owner","CreatedDate":"date","Body":"data1"}
{"Id":"2","ParentId":"parent","Name":"filename","OwnerId":"owner","CreatedDate":"date","Body":"data2"}
{"Id":"3","ParentId":"parent","Name":"filename","OwnerId":"owner","CreatedDate":"date","Body":"data3"}
{"Id":"4","ParentId":"parent","Name":"filename","OwnerId":"owner","CreatedDate":"date","Body":"data4"}
{"Id":"5","ParentId":"parent","Name":"filename","OwnerId":"owner","CreatedDate":"date","Body":"data5"}
]

The end result should be that I have 5 files:

  • 1.txt - Value inside the file should be: data1

  • 2.txt - Value inside the file should be: data2

  • 3.txt - Value inside the file should be: data3

  • 4.txt - Value inside the file should be: data4

  • 5.txt - Value inside the file should be: data5

I use Powershell 7.1.3

Is there any way that I can use $row.Id and $row.ParentId just like a regular ForEach would do?

thanks for your help.

4

There are 4 best solutions below

0
On

As others already explained, your json example is invalid.

However, since this is a huge file to process, you can use switch for this.

switch -Regex -File D:\Test\largefile.json {
    '"Id":"(\d+)".*"Body":"(\w+)"' { 
        Set-Content -Path ('D:\Test\{0}.txt' -f $matches[1]) -Value $matches[2]
    }
}

Results using your example would be 5 files called 1.txt .. 5.txt, each having a single line data1 .. data5

0
On

The question has many errors. Assuming the json has the missing commas in, I would do it this way, if I understand the question. This should work with the new updates to the question. I also have a more unusual solution involving streaming json with jq here: Iterate though huge JSON in powershell Json streaming support may be added later: ConvertFrom-JSON high memory consumption #7698

[{"Id":"ID","ParentId":"parent","Name":"filename","OwnerId":"owner","CreatedDate":"date","Body":"*******"},
 {"Id":"ID","ParentId":"parent","Name":"filename","OwnerId":"owner","CreatedDate":"date","Body":"*******"},
 {"Id":"ID","ParentId":"parent","Name":"filename","OwnerId":"owner","CreatedDate":"date","Body":"*******"},
 {"Id":"ID","ParentId":"parent","Name":"filename","OwnerId":"owner","CreatedDate":"date","Body":"*******"},
{"Id":"ID","ParentId":"parent","Name":"filename","OwnerId":"owner","CreatedDate":"date","Body":"*******"}
]
get-content -Path largefile.json | ForEach-Object {
  $_ = $_.TrimStart('[').TrimEnd(']').TrimEnd(',')
  if ($_) {
    $row = $_ | ConvertFrom-Json
    Set-Content -Path ($Row.Id + '.txt') -Value $row.Body
  }
}
get-content ID.txt

*******
1
On

I am still not sure what you expect as an outcome.
But I think you want to do this:

@'
[{"Id":"1","ParentId":"parent1","Name":"1.txt","OwnerId":"owner","CreatedDate":"date","Body":"Data1"}
{"Id":"2","ParentId":"parent2","Name":"2.txt","OwnerId":"owner","CreatedDate":"date","Body":"Data2"}
{"Id":"3","ParentId":"parent3","Name":"3.txt","OwnerId":"owner","CreatedDate":"date","Body":"Data3"}
{"Id":"4","ParentId":"parent4","Name":"4.txt","OwnerId":"owner","CreatedDate":"date","Body":"Data4"}
{"Id":"5","ParentId":"parent5","Name":"5.txt","OwnerId":"owner","CreatedDate":"date","Body":"Data5"}
]
'@ | Set-Content .\largefile.json

Get-Content .\largefile.json | ForEach-Object {
    $_ = $_.TrimStart('[').TrimEnd(']')
    If ($_) { 
        $Row = ConvertFrom-Json $_
        Set-Content -Path ".\$($Row.Name)" -Value $Row.Body
    }
}
2
On

It seems to me that this is what you're looking for:

Get-Content largefile.json | ForEach-Object {
    $row = $_.TrimStart('[').TrimEnd(']') | ConvertFrom-Json
    if ($null -ne $row) {
        Set-Content -Path ($row.Id) -Value ($row.Body)
    }
}