=?gb2312 encoding issue and recommendation


I am writing Java code to decode incoming email whose headers are encoded with =?gb2312?, and MimeUtility.decodeText() does not always succeed with the Chinese characters.
I have seen a few recommendations to use =?gb18030? instead, and it works for the set I tried.

Is it safe to replace gb2312 with gb18030?


There are 2 best solutions below

0
On BEST ANSWER

I was just looking into this for a customer the other day. You can use GBK or CP936 instead to get GB2312 to decode correctly.

0
On

I found that GB18030 works well. I also found that JavaMail uses this mapping:

# Chinese charsets are a mess and widely misrepresented.
# gb18030 is a superset of gbk, which is a superset of cp936/ms936,
# which is a superset of gb2312.
# https://bugzilla.gnome.org/show_bug.cgi?id=446783
# map all of these to gb18030.
gb2312      GB18030
cp936       GB18030
ms936       GB18030
gbk         GB18030
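
For anyone who wants to apply the same substitution by hand, here is a minimal sketch of the idea using only the JDK. It is not how JavaMail implements it. The class name Gb18030Fallback and the methods resolve and decodeWord are made up for illustration, the sketch only handles Base64 ("B") encoded-words, and it assumes the runtime ships the GB18030 and GB2312 charsets (a standard JDK does).

```java
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Gb18030Fallback {
    // Charset labels that the mapping above redirects to GB18030.
    private static final Set<String> LEGACY = Set.of("gb2312", "cp936", "ms936", "gbk");

    // Matches a single Base64 encoded-word: =?charset?B?payload?=
    // ("Q" encoded-words are not handled in this sketch.)
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?[Bb]\\?([^?]*)\\?=");

    // Returns GB18030 for the legacy Chinese labels, otherwise the named charset.
    public static Charset resolve(String label) {
        String key = label.toLowerCase(Locale.ROOT);
        return Charset.forName(LEGACY.contains(key) ? "GB18030" : label);
    }

    // Decodes one encoded-word; input that is not an encoded-word is returned unchanged.
    public static String decodeWord(String word) {
        Matcher m = ENCODED_WORD.matcher(word);
        if (!m.matches()) {
            return word;
        }
        byte[] raw = Base64.getDecoder().decode(m.group(2));
        return new String(raw, resolve(m.group(1)));
    }

    public static void main(String[] args) {
        // U+4E2D is in GB2312; U+9555 is a GBK-range character often mislabeled as gb2312.
        String text = "\u4e2d\u9555";
        byte[] bytes = text.getBytes(Charset.forName("GB18030"));
        String word = "=?gb2312?B?" + Base64.getEncoder().encodeToString(bytes) + "?=";

        // Decoding with the substituted charset versus the strict GB2312 decoder.
        System.out.println("with mapping:  " + decodeWord(word).equals(text));
        System.out.println("strict gb2312: " + new String(bytes, Charset.forName("GB2312")).equals(text));
    }
}
```

The main method builds a header that is labeled gb2312 but contains a character from outside that set, which is the situation the question describes, and then compares the mapped decode with a strict GB2312 decode.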