PHP: read Japanese characters and transform kanji into a readable form


I'm not sure if PHP is capable of this but,

I've got Japanese kanji characters such as 『漢字』 being displayed. I'd like PHP (or some other language) to read these characters and show how to read them, either in hiragana 「かんじ」 or romaji 「kanji」.

This way I will be able to display characters like this.

kanji
かんじ
漢字

Basically, add furigana to kanji (how to read the character).

Best answer

This is not a trivial problem.

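Consider the problems that arise from verb conjugations and their kana endings (送りがな), as well as from the difference between 音読み (Chinese-derived) and 訓読み (native Japanese) readings. How would PHP know the difference in reading between '食' in '食事' (しょくじ) and in '食べる' (たべる)?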

You need a morphological analyzer for this, such as MeCab.

If you install MeCab on your server, you can call the mecab command from PHP via exec().

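$key = '漢字';
setlocale(LC_CTYPE, 'en_US.UTF-8'); // escapeshellarg() can drop multibyte characters under a non-UTF-8 locale
$phonetic = exec('echo ' . escapeshellarg($key) . ' | mecab -O yomi'); // input is escaped to prevent shell injection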

Note that the yomi output format prints the phonetic reading in katakana.
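Since the question asks for hiragana above the kanji, the katakana reading can be converted afterwards. The following is a minimal sketch, assuming the mbstring extension is available. The function names katakana_to_hiragana and furigana_html are made up for illustration, and the ruby markup is just one possible way to display furigana in HTML.

```php
<?php
// Hypothetical helper: convert a full-width katakana reading to hiragana.
// The 'c' mode of mb_convert_kana() maps zenkaku katakana to zenkaku hiragana.
function katakana_to_hiragana(string $reading): string
{
    return mb_convert_kana($reading, 'c', 'UTF-8');
}

// Hypothetical helper: wrap a word and its reading in HTML ruby markup,
// which browsers render as furigana above the base text.
function furigana_html(string $word, string $katakanaReading): string
{
    $base = htmlspecialchars($word, ENT_QUOTES, 'UTF-8');
    $ruby = htmlspecialchars(katakana_to_hiragana($katakanaReading), ENT_QUOTES, 'UTF-8');
    return '<ruby>' . $base . '<rt>' . $ruby . '</rt></ruby>';
}
```

For example, furigana_html('漢字', 'カンジ') would wrap 漢字 with the reading かんじ. For a whole sentence you would still need per-word readings from the analyzer, which this sketch does not cover.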

To prevent encoding issues, you might want to run something like putenv('LANG=en_US.UTF-8'); before the exec() call so that the output is not garbled when it is stored in a PHP variable.
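An alternative that avoids the shell (and with it the quoting concerns) is to pass the text to the program over stdin with proc_open(). This is only a sketch: run_with_stdin and kanji_reading are hypothetical names, it assumes PHP 7.4 or later (for the array form of the command) and that mecab is on the PATH, and it is meant for short inputs only.

```php
<?php
// Hypothetical helper: run a command without a shell, write $input to its
// stdin, and return its trimmed stdout. Intended for short inputs; large
// inputs would need interleaved reads and writes to avoid blocking.
function run_with_stdin(array $command, string $input): string
{
    $descriptors = [
        0 => ['pipe', 'r'], // the child reads its stdin from this pipe
        1 => ['pipe', 'w'], // the child writes its stdout to this pipe
    ];
    // Passing the command as an array (PHP 7.4+) bypasses the shell entirely.
    $process = proc_open($command, $descriptors, $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException('Could not start ' . $command[0]);
    }
    fwrite($pipes[0], $input . "\n");
    fclose($pipes[0]); // signal end of input
    $output = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    proc_close($process);
    return trim((string) $output);
}

// Assumes mecab is installed and its dictionary defines the yomi format.
function kanji_reading(string $text): string
{
    putenv('LANG=en_US.UTF-8'); // same locale precaution as with exec()
    return run_with_stdin(['mecab', '-O', 'yomi'], $text);
}
```

Calling kanji_reading('漢字') would then return whatever reading mecab prints for that input.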

Even something like MeCab cannot give you 100% accuracy, because the correct reading often depends on context.