Chinese text encoding missing characters when viewed in web browser

760 Views Asked by user2539827 At 29 July 2025 at 04:59

I have an HTML file which contains Chinese text. When I open the file in any web browser, some characters appear to be missing.

Here's an example copied from the browser window:

本函旨在邀請您參�� 定於

I know for a fact that all other characters seen here are correct aside from the missing ones (confirmed by a native Chinese speaker).

In the HTML header, I have a tag which signifies the file contains UTF-8 encoded characters:

<META http-equiv="Content-Type" content="text/html; charset=utf-8">

I've already tried some other charsets in this META tag, but so far it seems any encoding method I try aside from UTF-8 ends up looking worse.

I also considered the possibility that it is a font issue, so I installed 3 different traditional Chinese fonts on my system and forced Chrome to use them. None of them made any difference - missing characters were still present.

If I open the HTML file with Notepad++, here's what I can see:

https://i.stack.imgur.com/Ex3C1.png

If I select and copy-paste this text into regular MS Notepad, I get this:

本函旨在邀請您參劦nbsp;定於

So you can see here that the "xE5 x8A" visible in Notepad++ seems to have been replaced by 劦.

Is there any reason why the browser would be showing �� instead of 劦 in this scenario?


John Machin On 18 December 2016 at 09:52 BEST ANSWER

Look again at the HTML file.

I see the first 2 bytes of a character encoded in UTF-8, followed by the literal text &nbsp; ... let's imagine there was originally a \xA0 byte, and this was mutated to &nbsp; when the file was created by applying global substitutions to the UTF-8-encoded data.
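To make that hypothesis concrete, here is a minimal Python sketch of the kind of byte-level substitution described above. The original sentence is an assumption: it supposes the intended text was 參加 with no space before 定於, which is not confirmed anywhere in the question. The variable names are made up for illustration.

```python
# Assumed original text (hypothetical): the broken character is taken to be 加.
original = "本函旨在邀請您參加定於"
raw = original.encode("utf-8")

# A global substitution applied to the *bytes* rather than to decoded text.
# 0xA0 is a non-breaking space in Latin-1, but here it is also the last
# byte of the UTF-8 sequence E5 8A A0, so the character gets truncated.
corrupted = raw.replace(b"\xa0", b"&nbsp;")

# A decoder that substitutes U+FFFD for invalid input, as browsers do.
shown = corrupted.decode("utf-8", errors="replace")

print(corrupted)
print(shown)
```

If this is what happened, the fix belongs in whatever tool generated the file: decode the bytes as UTF-8 first, then replace U+00A0 in the resulting text, and only then encode again.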

However, \xE5\x8A\xA0 decodes as UTF-8 to U+52A0 (加), which is not the same as the alien character 劦 (U+52A6) that Notepad shows ... so this is not close enough to an answer.
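The arithmetic can be checked with a short Python sketch. The second half is a guess that goes beyond the answer: it assumes Notepad used a lenient decoder that does not verify that the third byte is a real continuation byte, so it takes the low six bits of the "&" (0x26) that follows \xE5\x8A. The helper `lenient_three_byte` is hypothetical and only illustrates that assumption.

```python
# Strict decoding of the three bytes proposed in the answer.
restored = b"\xe5\x8a\xa0".decode("utf-8")
print(f"U+{ord(restored):04X}")  # U+52A0

def lenient_three_byte(b0: int, b1: int, b2: int) -> int:
    """Combine three bytes as a UTF-8 sequence WITHOUT validating them.

    Hypothetical model of a sloppy decoder: it masks off the payload
    bits and never checks that b1 and b2 start with the bits 10.
    """
    return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)

# The bytes actually in the file: E5 8A followed by "&" from "&nbsp;".
alien = lenient_three_byte(0xE5, 0x8A, ord("&"))
print(f"U+{alien:04X}")  # U+52A6
```

Under that assumption the "&" is consumed as part of the character, which would also match the pasted text reading 劦nbsp; rather than 劦&nbsp;. A strict decoder, such as the one in a browser, rejects the truncated sequence and shows replacement characters instead.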
