SQL Server: How to convert an XML variable to UTF-8 encoded varbinary(max)?

2.1k Views Asked by At

In SQL Server the default conversion from the xml type to varbinary uses XML text encoding (start and end tags with angled brackets etc) using UTF-16 character encoding. E.g.

declare @xml xml = '<foo><bar>abc</bar></foo>';
declare @foo varbinary(max);
set @foo = CONVERT(varbinary(max), @xml, 2);
select @foo

Yields:

0xFFFE3C0066006F006F003E003C006200610072003E006100620063003C002F006200610072003E003C002F0066006F006F003E00

Notice the zeros, indicating the double byte characters. Is there a way to convert into UTF-8?

1

There are 1 best solutions below

0
On

Maybe it's not the best way to do this, but I solved this converting the xml to varchar first:

declare @xml xml = '<foo><bar>abc</bar></foo>';
declare @xmlvarchar varchar(max) = CONVERT(varchar(max), @xml)
declare @foo varbinary(max) = CONVERT(varbinary(max), @xmlvarchar);
select @foo

The output is:

0x3C666F6F3E3C6261723E6162633C2F6261723E3C2F666F6F3E

Of course, varchar depends on your database collation, in my case (non special characters) this solution works perfectly.