Czech character 'ě' on a PHP page


I can't get this character to display correctly on my web pages. The page uses the UTF-8 charset; do I have to use ISO-8859-2 instead? I'm reading a string containing this character from a database, where it is saved as &#283;. My browser shows only the HTML entity, not the character.

It's the only character (so far) that I can't display on my web page. I took a look at http://www.czech.cz and they use UTF-8.

Any suggestions?

Take care!
Andrea


There are 2 best solutions below


Are you seeing the &#283; in the browser, or when you view source? If you're seeing it in the browser, then it's probably being double-encoded somewhere: whatever outputs it to the page is probably treating it as unencoded HTML and escaping it to protect you from HTML injection. You'll want to make it not do that.

But you have an even deeper problem. If your page is served as UTF-8 and your data is in UTF-8, there is no reason to turn the character into an HTML entity in the first place. You should be passing the UTF-8 data straight through. You do not need to switch to a different character encoding.
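If the database already contains entities, one possible cleanup is to decode them back to raw UTF-8 before escaping for output. This is only a minimal sketch: the `$stored` value is a hypothetical stand-in for a column read from the database, and it assumes the entity in question is the numeric reference for 'ě'.

```php
<?php
// Hypothetical value as it sits in the database (an HTML entity, not raw UTF-8).
$stored = '&#283;';

// Decode the entity back into the real character, encoded as UTF-8.
$utf8 = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

// 'ě' (U+011B) occupies two bytes in UTF-8: 0xC4 0x9B.
$hex = bin2hex($utf8);

// Escape once for output; htmlspecialchars() leaves 'ě' untouched.
$forPage = htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8');
```

The decoded value could also be written back to the database so that the stored data is plain UTF-8 from then on.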


First of all, yes, you really should be using UTF-8. But that doesn't mean the data you have is already UTF-8 encoded.

Secondly, it sounds like that character is HTML encoded in the database already. This is a problem, because it seems that whatever page is displaying this character also tries to HTML-encode the content as well. Here's an example of what I'm talking about.

Data from user: ě
Data HTML encoded (via htmlentities()) prior to going into DB: &#283;
Data stored in DB: &#283;
Data retrieved from DB: &#283;
Data HTML encoded before being printed to the page: &amp;#283;
Data as seen in the browser: &#283;

Do you see that? The character becomes double-encoded: on the second encoding step, the ampersand itself is converted into an entity (&amp;), so the browser displays the entity text instead of the character.
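The second encoding step can be reproduced in a few lines. This sketch uses `htmlspecialchars()` for the output-time escaping, which is an assumption about what the page does; `$stored` is a hypothetical stand-in for the value read from the database.

```php
<?php
// Assumption: the database already holds the numeric entity for 'ě'.
$stored = '&#283;';

// Escaping again at output time converts the ampersand itself.
$escapedAgain = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
// $escapedAgain is '&amp;#283;', which the browser renders as the text "&#283;".

// Passing false as the fourth argument ($double_encode) leaves existing entities alone.
$escapedOnce = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8', false);
// $escapedOnce is still '&#283;', which the browser renders as 'ě'.
```

Turning off double encoding is a workaround for data that is already stored encoded; it does not remove the entities from the database.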

This is the problem with HTML-encoding data before storing it in the database. That should only be done prior to displaying the content, not prior to storage.
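A minimal sketch of that approach, assuming the page is served as UTF-8 and the database connection also uses UTF-8. The variable names are hypothetical, and the database round trip is replaced by a plain assignment to keep the example self-contained.

```php
<?php
// Hypothetical input as typed by the user; store it unchanged as raw UTF-8.
$fromUser = "\xC4\x9B <b>bold</b>"; // "\xC4\x9B" is 'ě' in UTF-8

// Stand-in for an INSERT followed by a SELECT: no HTML encoding on the way in.
$storedInDb = $fromUser;

// Escape exactly once, at output time. Markup is neutralised, 'ě' passes through.
$forPage = htmlspecialchars($storedInDb, ENT_QUOTES, 'UTF-8');
```

With this arrangement the stored value stays usable outside HTML (for example in plain-text email or exports), and only the display layer needs to know about escaping.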