I am having a problem with a PowerShell `ConvertTo-Json` command. The resulting file has two non-printable characters as the first two characters of the file. Using `Format-Hex`, the output starts like this:
```
           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000   FF FE 5B 00 0D 00 0A 00 20 00 20 00 20 00 20 00  .þ[..... . . . .
00000010   7B 00 0D 00 0A 00 20 00 20 00 20 00 20 00 20 00  {..... . . . . .
...
```
The bad characters are the `FF` and `FE` in the 00 and 01 positions. The command I am using to generate the file is the following:

```powershell
$search.resources.attributes |
  Select-Object $Object.Attributes.ID |
  Sort-Object -Property displayname |
  ConvertTo-Json |
  Out-File ([Environment]::GetFolderPath("Desktop") + "\RL_Identities.json")
```
`$search` is the result of an `Invoke-RestMethod` call.

I am importing this file into an Oracle CLOB column. The column has a `CHECK (COLUMN_NAME IS JSON)` constraint, and the load fails because of the two non-printable characters at the start of the file. If I open the file in Notepad++, select all, and copy/paste into a new file, the new file loads perfectly because it doesn't have the two characters at the beginning.

Is there any reason the two characters are there? Is it a "feature" of the `ConvertTo-Json` command, or could it be coming from the data in the `Invoke-RestMethod` call? If there is no way to prevent these characters, is there a way to programmatically remove the first two bytes of the file?
Your file uses the "Unicode" (UTF-16LE) character encoding, which is what you get by default in Windows PowerShell when you use `>` / `Out-File`. `FF` and `FE` make up the so-called BOM (byte-order mark), aka Unicode signature, which identifies the encoding.

Instead, use `Set-Content` (or `Out-File`) with the `-Encoding` parameter to specify the desired encoding.
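For example, applied to the pipeline from the question (but note the caveat that follows):

```powershell
# Request UTF-8 explicitly instead of the Windows PowerShell default of UTF-16LE.
$search.resources.attributes |
  Select-Object $Object.Attributes.ID |
  Sort-Object -Property displayname |
  ConvertTo-Json |
  Out-File -Encoding utf8 ([Environment]::GetFolderPath("Desktop") + "\RL_Identities.json")
```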
The caveat is that if you need UTF-8, `-Encoding utf8` in Windows PowerShell creates UTF-8 files with a BOM, which not all consumers understand. (In PowerShell (Core) 7+, by contrast, you get BOM-less UTF-8 files by default.)
If you're on Windows PowerShell and need to create a BOM-less UTF-8 file, you can use the following workaround via `New-Item` (a minimal sketch; the file name is illustrative, and `ConvertTo-Json 'hü'` stands in for your real pipeline):
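```powershell
# New-Item with -Value writes the string as UTF-8 without a BOM in Windows PowerShell.
# $null = ... discards the FileInfo object that New-Item outputs.
$null = New-Item -Force .\out.json -Value (ConvertTo-Json 'hü')
```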
The sample `ConvertTo-Json` call results in verbatim `"hü"`. Passing the resulting file to `Format-Hex` shows that the file has no BOM and that `ü` (LATIN SMALL LETTER U WITH DIAERESIS, `U+00FC`) is correctly encoded as UTF-8 byte sequence `0xC3`, `0xBC`:
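The output should look along these lines (the exact rendering of the character column varies across PowerShell versions; the bytes are what matter):

```
PS> Format-Hex .\out.json

           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000   22 68 C3 BC 22                                   "h.."
```

Note the absence of the `FF FE` prefix: the five bytes are `"`, `h`, `0xC3 0xBC`, `"`, i.e. `"hü"` encoded as UTF-8.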