I'm not sure if PHP is capable of this, but I have Japanese kanji characters 『漢字』 being displayed. I'd like PHP (or some other language) to read these characters and show how to read them, either in hiragana 「かんじ」 or romaji 「kanji」.
This way I will be able to display characters like this:
kanji
かんじ
漢字
Basically, add furigana to kanji (how to read the character).
This is not a trivial problem.
Consider the problems that will arise from verb conjugations (送りがな) as well as 音読み and 訓読み. How does PHP know the difference in reading between '食' in '食事' and '食べる'?
You need a morphological analyzer for this, such as mecab. If you install mecab on your server, you can call it from PHP via exec.
*Note that mecab's yomi output format displays the phonetic reading in katakana.
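A minimal sketch of such a call, under some assumptions: the mecab binary is on the server's PATH, the installed dictionary defines the yomi output format (selected with -Oyomi), and the text is passed through a temporary file so that no Japanese text has to be quoted on the shell command line. The function names build_yomi_command and kanji_to_yomi are hypothetical helpers, not part of mecab or PHP.

```php
<?php
// Build the shell command that asks mecab for the reading of a file's contents.
// Assumes "mecab" is on PATH and the dictionary supports the yomi output format.
function build_yomi_command(string $inputPath): string
{
    // escapeshellarg() protects the (ASCII) temp file path from shell interpretation.
    return 'mecab -Oyomi ' . escapeshellarg($inputPath);
}

// Returns the reading reported by mecab, or null if mecab failed or is missing.
function kanji_to_yomi(string $text): ?string
{
    $tmp = tempnam(sys_get_temp_dir(), 'mecab');
    if ($tmp === false) {
        return null;
    }
    file_put_contents($tmp, $text . "\n");

    $output = [];
    $status = 0;
    exec(build_yomi_command($tmp), $output, $status);
    unlink($tmp);

    if ($status !== 0 || $output === []) {
        return null;
    }
    return trim(implode("\n", $output));
}

$reading = kanji_to_yomi('漢字');
echo $reading ?? 'mecab unavailable', "\n";
```

If the reading comes back as katakana and hiragana is wanted for furigana, it still has to be converted in a separate step.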
To prevent encoding issues, you might want to run something like
putenv('LANG=en_US.UTF-8');
prior to the exec so that the stdout is not garbled when stored in a variable in PHP.
Even something like mecab cannot give you 100% accuracy due to the complex nature of Japanese sentences.
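The encoding advice can be sketched as a small wrapper that sets LANG before every exec. This is only an illustration: exec_utf8 is a hypothetical helper name, and it assumes the en_US.UTF-8 locale is actually available on the server.

```php
<?php
// Run a shell command with LANG set to a UTF-8 locale, so the child process
// writes UTF-8 to stdout. Assumes the en_US.UTF-8 locale exists on the server.
function exec_utf8(string $command): ?string
{
    // putenv() changes the environment inherited by processes started afterwards.
    putenv('LANG=en_US.UTF-8');

    $output = [];
    $status = 0;
    exec($command, $output, $status);

    // A non-zero exit status is treated as failure.
    return $status === 0 ? implode("\n", $output) : null;
}

// Show the locale that the child process sees.
var_dump(exec_utf8('echo $LANG'));
```

The mecab command line would be passed to this wrapper in place of the echo command, and the returned string should be checked for null before it is displayed.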