I have a single text file that contains 60K+ lines. Those 60K+ lines are actually around 50 or so programs written in Natural. I need to break them apart into individual programs. I have a script that works perfectly except for a single flaw: the naming of the output files.
Every program starts with "Module Name=", followed by the actual name of the program. I need to split the programs and save them using the actual program names.
Using the example below, I would like to create two files called Program1.txt and Program2.txt each containing the lines belonging to them. I have a script, also below, that separates the files correctly, but I am unable to discern the correct way to capture the Program name and use that as the name of the output file.
Example:
Module Name=Program1
....
....
....
END
Module Name=Program2
....
....
....
END
Code:
$InputFile = "C:\Natural.txt"
$Reader = New-Object System.IO.StreamReader($InputFile)
$a = 1
While (($Line = $Reader.ReadLine()) -ne $null) {
    If ($Line -match "Module Name=") {
        $OutputFile = "MySplittedFileNumber$a.txt"
        $a++
    }
    Add-Content $OutputFile $Line
}
Combine a switch statement, which can read files line by line efficiently with -File and can match each line against regex(es) with -Regex, and use a System.IO.StreamWriter instance to write the output files efficiently.

Note:
- This approach assumes that the very first line in the input file is a Module Name=... line.
- The regex matching is case-insensitive by default, as PowerShell generally is; add -CaseSensitive if needed.
- The automatic $Matches variable is used to extract the program name from the matching result.
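
The approach described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name Split-NaturalModule and its -OutputDir parameter are hypothetical, and the regex assumes that the program name is the run of non-whitespace characters directly after "Module Name=". As in the question's script, the marker is matched anywhere in the line.

```powershell
# Hypothetical helper: splits $InputFile into one <ProgramName>.txt per module.
function Split-NaturalModule {
    param(
        [Parameter(Mandatory = $true)] [string] $InputFile,
        [string] $OutputDir = $PWD.Path
    )

    # Resolve to full paths: .NET APIs do not follow PowerShell's current location.
    $InputFile = (Resolve-Path -LiteralPath $InputFile).Path
    $OutputDir = (Resolve-Path -LiteralPath $OutputDir).Path

    $writer = $null
    try {
        switch -Regex -File $InputFile {
            # Assumption: the name is everything up to the next whitespace.
            'Module Name=(\S+)' {
                # A new program starts: close the previous output file, if any.
                if ($writer) { $writer.Close() }
                # $Matches[1] holds what (\S+) captured, i.e. the program name.
                $outPath = Join-Path $OutputDir ($Matches[1] + '.txt')
                $writer = New-Object System.IO.StreamWriter $outPath
                $writer.WriteLine($_)
            }
            default {
                # Any other line belongs to the program currently being written.
                # Assumes a "Module Name=" line has already been seen.
                $writer.WriteLine($_)
            }
        }
    }
    finally {
        # Flush and release the last output file even if an error occurred.
        if ($writer) { $writer.Close() }
    }
}
```

Called as, for example, Split-NaturalModule -InputFile C:\Natural.txt -OutputDir C:\Programs (the output folder is made up here and must already exist), this should produce Program1.txt and Program2.txt for the example input. Existing files with the same names are overwritten, and a program name containing characters that are invalid in file names would need extra handling.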