Hey I am looking to deduplicate some data and combine columns from a CSV. Can't get my head around how to do it. Here is a sample of the data I am working with:
cmmc,stig,descr
AC.1.001,SV-205663r569188_rule,The ability to set access permissions and auditing is critical to maintaining the security and proper access controls of a system. To support this volumes must be formatted using a file system that supports NTFS attributes.
AC.1.001,SV-205667r569188_rule,Inappropriate granting of user rights can provide system administrative and other high-level capabilities.
AC.1.002,SV-205663r569188_rule,The ability to set access permissions and auditing is critical to maintaining the security and proper access controls of a system. To support this volumes must be formatted using a file system that supports NTFS attributes.
AC.1.002,SV-205665r569188_rule,Enterprise Domain Controllers groups on domain controllers.
I am pretty close to the data I am looking for but struggling to add the |<value of 'descr'>
after the item in the second column:
Here's my script:
Import-CSV '.\input.csv' | Group-Object 'cmmc' |
ForEach-Object {
[PsCustomObject]@{
cmmc = $_.name
stig = $_.group.stig -Join '
'
}
} | Export-Csv '.\output.csv' -NoTypeInformation
The output looks like this (formatted for readability, column names omitted):
AC1.001 SV-205663r569188_rule
SV-205665r569188_rule
AC1.002 SV-205663r569188_rule
SV-205665r569188_rule
But I am looking for this:
AC.1.001 SV-205663r569188_rule|The ability to set access permissions and auditing is critical to maintaining the security and proper access controls of a system. To support this volumes must be formatted using a file system that supports NTFS attributes.
SV-205667r569188_rule|Inappropriate granting of user rights can provide system administrative and other high-level capabilities.
AC.1.002 SV-205663r569188_rule|The ability to set access permissions and auditing is critical to maintaining the security and proper access controls of a system. To support this volumes must be formatted using a file system that supports NTFS attributes.
SV-205665r569188_rule|Enterprise Domain Controllers groups on domain controllers.
Use the following, which makes use of calculated properties in combination with the
Select-Object
cmdlet applied to the results from yourGroup-Object
call:Note:
• The
[array]
type constraints for$stigs
and$descrs
are needed to handle the case where a group comprises only one record, in which case$_.Group.sig
and$_.Group.descr
, due to the behavior of member-access enumeration, return only a single string rater than a single-element array; without the[array]
cast, indexing (e.g.[$i]
) would then be performed on[string]
instances, which would return the a single character at that position from the string.• In the
Export-Csv
call, adjust-Encoding
as needed. BOM-less UTF-8 is now the default in PowerShell (Core) 7+, and-NoTypeInformation
is no longer required there.The resulting file has the following content, showing the use of column-internal newlines (which are protected by the value as a whole being enclosed in
"..."
):To visualize that this yields the desired data, you can re-import the resulting file and pipe it to
Format-Table
with the-Wrap
switch:Note that
-Wrap
respects the property-internal newlines, but additionally breaks individual lines into multiple ones if they're too wide for the console window.