PHP/MySQL Encoding

I have a website with Arabic content that was migrated from a different server. On the old server everything displayed correctly, and supposedly everything was encoded in UTF-8.

On the current server, the data started displaying incorrectly, showing Ù†Ø¨Ø°Ø© Ø¹Ù† and similar characters.

The application is built on the CakePHP framework.

After many trials, I changed the 'encoding' parameter in the MySQL connection array to 'latin1'. For those who don't know CakePHP, this sets MySQL's connection encoding. Setting this value to UTF8 did not change anything, even after the steps described below.

Some of the records started showing correctly in Arabic, while others remained gibberish.

I have already gone through all the database and server checks, confirming that:

  1. The database created is UTF-8.
  2. The table is UTF-8.
  3. The columns are not explicitly set to any encoding, so they inherit UTF-8.
  4. The default character set in PHP is UTF-8.
  5. The mysql.cnf settings default to UTF-8.

After that, I retrieved my data and looped through it, printing the encoding of each string (from each row) using mb_detect_encoding. The rows that display correctly return UTF-8, while the corrupt rows return nothing.
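When mb_detect_encoding reports nothing (it returns false when no candidate encoding matches), looking at the raw bytes is often more informative. The following is a minimal sketch, not part of the original question; describeBytes is a hypothetical helper name, and the sample values are made up for illustration.

```php
<?php
// Hypothetical helper: report whether a fetched value is valid UTF-8
// and expose its raw bytes as hex so the rows can be compared by eye.
function describeBytes(string $value): array
{
    return [
        'valid_utf8' => mb_check_encoding($value, 'UTF-8'),
        'hex'        => bin2hex($value),
    ];
}

// Illustrative samples (assumptions, not data from the site):
$good = "نبذة";       // Arabic text stored as UTF-8
$bad  = "\xE4\xF6";   // two bytes that do not form valid UTF-8

print_r(describeBytes($good));
print_r(describeBytes($bad));
```

Running this over a few good and a few corrupt rows shows whether the corrupt ones hold invalid byte sequences or valid UTF-8 that merely encodes the wrong characters.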

The data of the website has been edited multiple times, possibly with different encodings; this is something I cannot know for sure. What I can confirm, though, is that the only two encodings this data might have passed through are UTF-8 and latin1.

Is there any way to recover the data when mb_detect_encoding returns nothing and the current encoding is unknown?
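If only UTF-8 and latin1 are involved, one common failure is that UTF-8 bytes were read as latin1 and then stored again as UTF-8. Below is a hedged sketch of a per-row heuristic for that one case. repairDoubleEncoded is a hypothetical helper, and it assumes PHP's Windows-1252 table is close enough to MySQL's latin1. Bytes that Windows-1252 leaves undefined may not round-trip, so test it on a copy of the data first.

```php
<?php
// Hypothetical helper (not from the original post): undo one round of
// UTF-8 -> latin1 -> UTF-8 double encoding and leave other strings untouched.
function repairDoubleEncoded(string $value): string
{
    // Input that is not valid UTF-8 is a different problem; do not touch it.
    if (!mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }

    // Map every character back to the single byte it would have come from.
    $bytes = mb_convert_encoding($value, 'Windows-1252', 'UTF-8');

    // Characters outside Windows-1252 (for example real Arabic) become "?",
    // so converting back no longer reproduces the input.
    $lossless = mb_convert_encoding($bytes, 'UTF-8', 'Windows-1252') === $value;

    // Accept the result only if it changed something and is valid UTF-8.
    if ($lossless && $bytes !== $value && mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }

    return $value;
}

// Illustrative input: the double-encoded form of the Arabic word "نبذة".
$mojibake = "\u{D9}\u{2020}\u{D8}\u{A8}\u{D8}\u{B0}\u{D8}\u{A9}";
echo repairDoubleEncoded($mojibake), PHP_EOL;
```

Correct Arabic, plain ASCII, and genuine Latin text such as "café" all pass through unchanged, which is why the same function can be applied to a table that mixes good and corrupt rows.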

UPDATE: I have found out that while the database was active on the new server, my.cnf was updated. The directive below was changed:

character-set-server=utf8

To

default-character-set=utf8

I am not sure how much of a difference this makes, though.

Judging by the modified dates, I can conclude with some certainty that the data I could recover was not edited on the new server, while the data I couldn't recover was.

There are 3 best solutions below

3
On

Try to fix the problem on the database side, not in PHP or the database connection.

I advise you to go to your old server and export the database again with the UTF8 character set.

Then, after importing it on the new server, make sure you can see the Arabic characters inside the tables (for example with phpMyAdmin).

If the tables look fine, you can move on to checking the following:

  • DB connection

  • PHP file encoding

  • the header encoding in HTML

As far as I know, if the problem is in the database, there is no way around exporting the data again from the old server.

Edit:

If you do not have access to your old database, please check this answer; it may help you.

4
On

You were expecting نبذة عن? Mojibake. See duplicate for discussion and solution, including how to recover the data via a pair of ALTER TABLEs.
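To illustrate how this kind of Mojibake arises, here is a small sketch that takes the expected Arabic text, treats its UTF-8 bytes as single-byte characters, and re-encodes them as UTF-8. It uses Windows-1252 as a stand-in for MySQL's latin1, which is an assumption of this example.

```php
<?php
$original = "نبذة عن";

// Each Arabic letter is two bytes in UTF-8.
echo bin2hex($original), PHP_EOL;   // d986d8a8d8b0d8a920d8b9d986

// Read those bytes as Windows-1252 characters and write them out as UTF-8.
// This is what a misconfigured connection effectively does on insert.
$mojibake = mb_convert_encoding($original, 'UTF-8', 'Windows-1252');
echo $mojibake, PHP_EOL;            // Ù†Ø¨Ø°Ø© Ø¹Ù†
```

Every pair of Latin characters in the gibberish corresponds to one Arabic letter, which is why the damage is reversible as long as no bytes were lost along the way.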

0
On

I had a similar problem when migrating utf8-encoded database tables from a public server to localhost. The fix was to set the connection character set from PHP by calling

$db->set_charset("utf8");

right after opening the mysqli connection.

Now it works properly.