What's the most efficient way to decode a UTF-16 binary?


Rebol 3 supports Unicode and uses UTF-16 internally when needed (a string that contains only ASCII characters is stored as ASCII), so decoding should be as simple as copying the memory content from the binary and setting up the REBVAL structure. However, the only way I have found is to iterate over the binary and convert each character individually.

The same question applies to encoding a string as UTF-16.
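
For reference, the per-character approach I mean looks roughly like the sketch below. The function name `utf16le-to-string` is made up for illustration. It assumes little-endian input with an even number of bytes and handles only code points in the Basic Multilingual Plane (no surrogate pairs).

```rebol
; Hypothetical helper, not part of Rebol 3: decode a UTF-16LE binary
; one 16-bit code unit at a time.
utf16le-to-string: func [
    "Decode a UTF-16LE binary (BMP only, even length assumed)"
    bin [binary!]
    /local out lo hi
][
    out: make string! length? bin
    ; FOREACH over a binary yields integers, taken here two bytes at a time
    foreach [lo hi] bin [
        ; little-endian: the low byte comes first, then the high byte
        append out to char! lo + (hi * 256)
    ]
    out
]
```

Called as `utf16le-to-string b` on a UTF-16LE binary `b`, this does one `append` per character, which is the overhead I would like to avoid.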

Best answer

OK, there doesn't seem to be an easy way to do it, so I added two codecs, UTF-16LE and UTF-16BE, for this purpose. See this commit: https://github.com/zsx/r3/commit/630945070eaa4ae4310f53d9dbf34c30db712a21

With this change, you can do:

>> b: encode 'utf-16le "hello"
== #{680065006C006C006F00}

>> s: decode 'utf-16le b       
== "hello"

>> b: encode 'utf-16be "hello" 
== #{00680065006C006C006F}

>> s: decode 'utf-16be b 
== "hello"