What is the relationship between the Visual Studio "Character Set" configuration and the encoding scheme?


Microsoft's documentation states:

Multibyte character sets, in particular the double-byte character sets (DBCS). Multibyte character sets provide a means to represent the large number of characters in many Asian languages.

DBCS code pages are used for languages such as Japanese and Chinese. In such a code page, some characters have two-byte encodings

Based on the above, I get results that look contradictory in two of the four possible cases, and I have questions about three of the four:

Case 1 (Contradicting):

  • I assume that when I choose "Use Multi-Byte Character Set", the following will automatically use a DBCS encoding:

string chineseString = "我是路人";

but instead the compiler says:

warning C4566: character represented by universal-character-name '\u6211' cannot be represented in the current code page (1252)

which contradicts the setting itself, because code page 1252 only covers Western languages. Isn't it supposed to use MBCS/DBCS here?
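One way to check what the project setting actually changes is to report which character-set macros the build defines. This is a minimal sketch: it assumes the setting works by defining `_MBCS` or `_UNICODE`/`UNICODE`, and the function name `characterSetMacros` is hypothetical.

```cpp
#include <string>

// Hypothetical helper: lists the character-set macros visible to this
// translation unit. Assumption: "Use Multi-Byte Character Set" defines
// _MBCS and "Use Unicode Character Set" defines _UNICODE and UNICODE.
std::string characterSetMacros() {
    std::string result;
#ifdef _MBCS
    result += "_MBCS ";
#endif
#ifdef _UNICODE
    result += "_UNICODE ";
#endif
#ifdef UNICODE
    result += "UNICODE ";
#endif
    if (result.empty()) {
        result = "(none)";  // e.g. when built outside Visual Studio
    }
    return result;
}
```

Printing the result under each setting, next to the bytes of the same narrow literal, would show whether the setting influences how `"..."` literals are encoded. If the bytes do not change, the warning in Case 1 more likely comes from the compiler's execution character set than from this option.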

Case 2 (Understandable, non-contradicting):

  • I choose "Use Unicode Character Set"

Now I assume I have to specify an encoding, so I write:

string chineseString = u8"我是路人";

which works and makes sense to me.
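To confirm that the `u8` prefix really yields UTF-8 bytes, the literal can be dumped as hex. This is a sketch with hypothetical helper names (`toHex`, `utf8Sample`); it uses the escape `\u6211` (the first character of the string above) so the result does not depend on how the source file is saved, and the cast is there only because `u8"..."` has type `char8_t` from C++20 onward.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats each byte of `s` as two upper-case hex digits,
// separated by single spaces.
std::string toHex(const std::string& s) {
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < s.size(); ++i) {
        unsigned byte = static_cast<unsigned char>(s[i]);
        std::snprintf(buf, sizeof buf, "%02X", byte);
        if (i != 0) {
            out += ' ';
        }
        out += buf;
    }
    return out;
}

// Hypothetical helper: returns U+6211 as a UTF-8 encoded std::string.
// The reinterpret_cast keeps this compiling in C++20, where u8 literals
// are arrays of char8_t rather than char.
std::string utf8Sample() {
    return std::string(reinterpret_cast<const char*>(u8"\u6211"));
}
```

Calling `toHex(utf8Sample())` should give `E6 88 91`, the three-byte UTF-8 form of U+6211, whichever Character Set option is selected.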

Case 3 (Contradicting):

  • I choose "Use Multi-Byte Character Set":

wstring chineseStringW = L"我是路人";

So is this now using the DBCS encoding? If so, why does string not pick up DBCS? Or does it work just because \u6211 fits in a wchar_t?

Case 4:

  • I choose "Use Unicode Character Set":

wstring chineseStringW = L"我是路人";

So is the encoding now UTF-16LE?
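For Cases 3 and 4, the code units stored in the wide string can be inspected directly. This is a sketch with a hypothetical helper name (`codeUnits`); it assumes `wchar_t` is 16 bits wide, as it is with the Visual Studio compiler on Windows, so that each element is one UTF-16 code unit.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies each wchar_t of `s` into an unsigned integer
// so the raw code-unit values can be printed or compared.
std::vector<unsigned long> codeUnits(const std::wstring& s) {
    std::vector<unsigned long> units;
    for (wchar_t c : s) {
        units.push_back(static_cast<unsigned long>(c));
    }
    return units;
}
```

For `L"\u6211"` this should yield the single value `0x6211`. Comparing the output under both Character Set options would show whether the setting has any effect on `L"..."` literals. The byte order in memory (the "LE" part) is a property of the target CPU and can be checked separately by viewing the string's storage as bytes.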
