Microsoft's documentation states:
Multibyte character sets, in particular the double-byte character sets (DBCS). Multibyte character sets provide a means to represent the large number of characters in many Asian languages.
DBCS code pages are used for languages such as Japanese and Chinese. In such a code page, some characters have two-byte encodings
Based on the above, I get contradicting results in 2 of the 4 possible cases, and I have three questions spread over 3 of the 4 cases.
Case 1 (contradicting):
- I assume that when I choose "Use Multi-Byte Character Set", the following will automatically use a DBCS encoding:
string chineseString = "我是路人";
but instead the compiler says:
warning C4566: character represented by universal-character-name '\u6211' cannot be represented in the current code page (1252)
This contradicts the setting itself, because code page 1252 only covers Western European languages. Isn't it supposed to use MBCS/DBCS here?
Case 2 (understandable, non-contradicting):
- I choose "Use Unicode Character Set".
Now I assume I have to specify an encoding explicitly, so I do this:
string chineseString = u8"我是路人";
which works and makes sense to me.
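To see which bytes actually end up in the narrow string, the contents can be dumped as hex. This is only a sketch: hexBytes is a hypothetical helper name, and the expected bytes assume the literal really is stored as UTF-8.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats every byte of a narrow string as two
// uppercase hex digits followed by a space.
std::string hexBytes(const std::string& s) {
    std::string out;
    char buf[4];  // two hex digits + space + terminating null
    for (unsigned char c : s) {
        std::snprintf(buf, sizeof buf, "%02X ", static_cast<unsigned>(c));
        out += buf;
    }
    return out;
}
```

For example, hexBytes(reinterpret_cast<const char*>(u8"\u6211")) gives "E6 88 91 ", the UTF-8 encoding of 我. The cast is there because u8 literals have type char8_t from C++20 on.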
Case 3 (contradicting):
- I choose "Use Multi-Byte Character Set":
wstring chineseStringW = L"我是路人";
Is this now using a DBCS encoding? If so, why doesn't the string in Case 1 pick up DBCS as well? Or does it work simply because \u6211 fits in a wchar_t?
Case 4:
- I choose "Use Unicode Character Set":
wstring chineseStringW = L"我是路人";
So is the encoding now UTF-16LE?
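Cases 3 and 4 could be compared by dumping the code units of the wide string under both settings. Again a sketch only: hexUnits is a hypothetical helper name, and the 16-bit width of wchar_t is specific to Windows.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats every wchar_t code unit of a wide string
// as at least four uppercase hex digits followed by a space.
std::string hexUnits(const std::wstring& s) {
    std::string out;
    char buf[16];  // enough for 8 hex digits + space + terminating null
    for (wchar_t c : s) {
        std::snprintf(buf, sizeof buf, "%04lX ", static_cast<unsigned long>(c));
        out += buf;
    }
    return out;
}
```

For example, hexUnits(L"\u6211") gives "6211 ". If the output is identical under both project settings, the setting does not affect how the L"..." literal is encoded.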