Gettext failed to extract non-ASCII characters

489 Views Asked by liftarn At 27 July 2025 at 17:58

In my source files I have strings containing non-ASCII characters, like

sCursorFormat = TRANSLATE("Frequency (Hz): %s\nDegree (°): %s");

But when I extract them, the non-ASCII characters vanish:

msgid ""
"Frequency (Hz): %s\n"
"Degree (): %s"
msgstr ""

I have specified the encoding when extracting as

xgettext --from-code=UTF-8

I'm running under MS Windows and the source files are C++ (not that it should matter).


There is 1 best solution below

Dialecticus On 22 March 2022 at 12:13

The encoding of your source file is probably not UTF-8 but "ANSI", which means whatever code page Windows uses for non-Unicode applications (probably code page 1252). If you open the file in a hex editor, you will see the single byte 0xB0 standing for the degree symbol. On its own, that byte is not a valid UTF-8 sequence; in UTF-8 the degree symbol is encoded as the two bytes 0xC2 0xB0. This is why the character vanishes when you use --from-code=UTF-8.
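If you want to confirm this without a hex editor, a small check along the following lines can tell you whether a string (or a whole file read into a string) is well-formed UTF-8. This is only a sketch: the names is_valid_utf8 and check_degree_sign are made up for illustration and have nothing to do with gettext itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `bytes` is well-formed UTF-8.
bool is_valid_utf8(const std::string& bytes) {
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;                 // number of continuation bytes expected
        if (lead <= 0x7F) extra = 0;           // plain ASCII
        else if (lead >= 0xC2 && lead <= 0xDF) extra = 1;
        else if (lead >= 0xE0 && lead <= 0xEF) extra = 2;
        else if (lead >= 0xF0 && lead <= 0xF4) extra = 3;
        else return false;                     // 0x80..0xC1 and 0xF5..0xFF cannot start a sequence
        if (i + extra >= n) return false;      // sequence is cut off at the end of the input
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;   // continuation bytes look like 10xxxxxx
        }
        if (extra >= 2) {
            // Reject overlong forms, UTF-16 surrogates, and values above U+10FFFF.
            const unsigned char second = static_cast<unsigned char>(bytes[i + 1]);
            if (lead == 0xE0 && second < 0xA0) return false;
            if (lead == 0xED && second > 0x9F) return false;
            if (lead == 0xF0 && second < 0x90) return false;
            if (lead == 0xF4 && second > 0x8F) return false;
        }
        i += extra + 1;
    }
    return true;
}

// Usage sketch: the lone code page 1252 byte fails, the two-byte UTF-8 form passes.
void check_degree_sign() {
    assert(!is_valid_utf8("Degree (\xB0): %s"));
    assert(is_valid_utf8("Degree (\xC2\xB0): %s"));
}
```

If the check fails on a source file that contains a degree sign, the file is most likely saved in a single-byte code page rather than UTF-8.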

The solution to your problem is to use --from-code=windows-1252 or, better yet, to save all source files as UTF-8 and keep using --from-code=UTF-8.
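To show what re-saving as UTF-8 actually does to the bytes, here is a minimal sketch of a code page 1252 to UTF-8 conversion. The function names are hypothetical, and mapping the five bytes that code page 1252 leaves undefined to U+FFFD is my own assumption. In practice you would let your editor's "save with encoding" option or a tool such as iconv do this.

```cpp
#include <cassert>
#include <string>

// Code points for code page 1252 bytes 0x80..0x9F. 0xFFFD (replacement
// character) stands in for the five undefined bytes; that choice is an assumption.
static const char32_t kCp1252High[32] = {
    0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD,
    0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178,
};

// Appends one code point (at most U+FFFF here) to `out` as UTF-8.
void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: converts code page 1252 bytes to UTF-8.
std::string cp1252_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char ch : in) {
        const unsigned char b = static_cast<unsigned char>(ch);
        // Outside 0x80..0x9F the byte value equals the Unicode code point.
        const char32_t cp = (b >= 0x80 && b <= 0x9F) ? kCp1252High[b - 0x80] : b;
        append_utf8(out, cp);
    }
    return out;
}

// Usage sketch: the single byte 0xB0 becomes the two bytes 0xC2 0xB0.
void check_degree_conversion() {
    assert(cp1252_to_utf8("Degree (\xB0): %s") == "Degree (\xC2\xB0): %s");
}
```

Once the file contains 0xC2 0xB0 instead of the lone 0xB0, xgettext --from-code=UTF-8 should keep the degree sign in the msgid.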
