Displaying Unicode characters above U+FFFF on Windows

Question

1.3k Views Asked by hrniels At 19 April 2025 at 11:25

The application I'm developing with EVC++ 4 runs on Windows CE 5 and should support Unicode (AFAIK wchar_t is UTF-16 on Windows, so I'm using that), so I want to test it with "more exotic" characters, especially ones that take 4 bytes in UTF-16 rather than 2. To that end I'm trying to display such characters in a text editor (at the moment on my desktop PC with Windows XP, not on the embedded device).

But I haven't managed to do so yet. As an example I've chosen this character. As mentioned here, "MPH 2B Damase" should support this character. So I downloaded the font and put it into Windows\Fonts. I created a text file using a hex editor (just to be sure) with the following content:

FFFE D802 DC00

When I open it with Notepad (which should be Unicode-capable, right?) and use the downloaded font, it doesn't display one character, as intended, but these two:

˘Ü

What am I doing wrong? :)

Thanks!

hrniels

Edit: Flipping the BOM, as suggested, doesn't work. Notepad (and all the other editors I tried, too) displays two squares in this case. Interestingly, if I copy the two squares here (with Firefox) I see the right character:

I've also tried it with Komodo Edit with the same result.

Using UTF-8 doesn't help notepad either.

There are 3 best solutions below

**sorin** · Answer 1

You probably missed the _wfopen() documentation, which describes the encoding parameter (ccs). By the way, I'm assuming you are already using Unicode (wchar_t) strings.

I would recommend using UTF-8 in files, with or without a BOM, and forcing your fopen call to use the UTF-8 flag. It looks like _wfopen(L"newfile.txt", L"r, ccs=UTF-8"); will work with UTF-8 with or without a BOM, and also with UTF-16 when the file starts with a BOM. Note that _wfopen takes wide strings, hence the L prefixes. Do not make the mistake of using ccs=UNICODE: UTF-8 files without a BOM are common, and as far as I know that flag treats a BOM-less file as ANSI.
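Since the behaviour above depends on whether the file starts with a BOM, it can help to check the first bytes of a test file by hand. The following is a minimal, portable sketch of BOM sniffing. The name detect_bom is hypothetical, and this is only an illustration, not how the CRT implements the ccs flag.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: names the encoding implied by a leading byte order
// mark, or "none" if the buffer does not start with one.
// Illustration only; it does not distinguish a UTF-32LE BOM (FF FE 00 00)
// from a UTF-16LE BOM followed by a NUL character.
std::string detect_bom(const unsigned char* data, std::size_t size) {
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return "UTF-8";
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return "UTF-16LE";
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return "UTF-16BE";
    return "none";
}
```

Applied to the file from the question, which starts with FF FE, this reports UTF-16LE, so the bytes that follow are read as little-endian code units.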

You should really read a little about Unicode before trying to work with it. Think of this as a very good investment: understanding how Unicode works will save you time.

Here is a start http://blog.i18n.ro/newbie-guide-to-unicode/ and do not forget to read the links from the end of the article.

If you really need a simple text editor that allows you to play with Unicode encodings, use Notepad++ and forget about Notepad.

**Skurmedel** · Answer 2

Your text editor might not like UTF-16. It probably assumes ANSI or UTF-8.

Try typing in the UTF-8 equivalent instead:

0xF0 0x90 0xA0 0x80

This won't help your testing, but will make sure your font isn't at fault. A text editor that does support UTF-16 is Komodo Edit.
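For reference, the four bytes above can be derived from the code point itself. Here is a minimal sketch of UTF-8 encoding, assuming the character in the question is U+10800 (the code point that the surrogate pair D802 DC00 stands for). The name encode_utf8 is hypothetical, and the function does not reject surrogates or values above U+10FFFF.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: encodes one Unicode code point as UTF-8 bytes.
// Sketch only; no validation of surrogates or out-of-range values.
std::vector<std::uint8_t> encode_utf8(std::uint32_t cp) {
    std::vector<std::uint8_t> out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        out.push_back(static_cast<std::uint8_t>(cp));
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out.push_back(static_cast<std::uint8_t>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<std::uint8_t>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<std::uint8_t>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    }
    return out;
}
```

Calling encode_utf8(0x10800) yields the bytes F0 90 A0 80, matching the sequence given above.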

**AudioBubble** · Answer 3

What happens if you put the byte order mark the other way around?

FEFF D802 DC00

(At the moment the byte sequence is being interpreted as the two characters U+02D8 U+00DC, because the FF FE mark declares little-endian order. Flipping the BOM should cause the bytes to be read in the intended order.)
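To make the byte-order effect concrete, here is a minimal sketch that decodes the four payload bytes D8 02 DC 00 both ways. The name decode_utf16 is hypothetical; the function expects the BOM to be stripped already and passes unpaired surrogates through unchanged rather than reporting an error.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decodes UTF-16 bytes (BOM already removed) into
// Unicode code points, using the given byte order.
std::vector<std::uint32_t> decode_utf16(const std::vector<std::uint8_t>& bytes,
                                        bool big_endian) {
    // Step 1: assemble 16-bit code units from byte pairs.
    std::vector<std::uint16_t> units;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        std::uint16_t unit = big_endian
            ? static_cast<std::uint16_t>((bytes[i] << 8) | bytes[i + 1])
            : static_cast<std::uint16_t>((bytes[i + 1] << 8) | bytes[i]);
        units.push_back(unit);
    }

    // Step 2: combine high/low surrogate pairs into one code point.
    std::vector<std::uint32_t> code_points;
    for (std::size_t i = 0; i < units.size(); ++i) {
        std::uint32_t unit = units[i];
        bool is_high = unit >= 0xD800 && unit <= 0xDBFF;
        bool next_is_low = i + 1 < units.size() &&
                           units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF;
        if (is_high && next_is_low) {
            std::uint32_t low = units[++i];
            code_points.push_back(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00));
        } else {
            // BMP character, or an unpaired surrogate passed through as-is.
            code_points.push_back(unit);
        }
    }
    return code_points;
}
```

With the bytes D8 02 DC 00, little-endian decoding gives the two code points U+02D8 and U+00DC, which is what Notepad showed, while big-endian decoding gives the single code point U+10800.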
