We have a requirement to transliterate Arabic text to Latin characters (without diacritical marks) and display it to users.
We are currently using IBM ICU4J for this. The API doesn't transliterate the Arabic text into properly readable Latin characters. Refer to the example below:
Example
Arabic text:
صدام حسين التكريتي
Google's transliteration output:
Sadaam Hussein al-tikriti
ICU4J's transliteration output:
ṣdạm ḥsyn ạltkryty
How can we improve the transliterated output of the ICU4J library?
ICU4J gives us the option to write our own rules, but we are currently stuck: no one on our team knows Arabic, and we have been unable to find a proper standard to follow.
It took me about 4 hours of research to look for another way to tackle this problem. I then tried ICU4J and found a solution to your problem. You can run the code and see the point you were missing.
Check the answer and verify it yourself; the output you receive should be exactly as shown below.
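Independently of that code, one part of the requirement, dropping the diacritical marks from ICU4J's output, can be sketched with the JDK alone. This is a minimal sketch under assumptions, not the ICU4J solution itself: the class name `DiacriticStripper` and the method `strip` are hypothetical, and the input is the ICU4J output from the question written with Unicode escapes. It decomposes each character with `java.text.Normalizer` and removes the combining marks. It does not restore the short vowels that Arabic script leaves unwritten, so the result is still not as readable as Google's.

```java
import java.text.Normalizer;

// Hypothetical helper: removes diacritical marks from Latin text.
class DiacriticStripper {

    // Decompose to NFD so that each base letter is followed by its
    // combining marks, then delete every character in Unicode category M.
    static String strip(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // The ICU4J output quoted in the question, written with escapes
        // so the source file stays ASCII-only.
        String icuOutput = "\u1E63d\u1EA1m \u1E25syn \u1EA1ltkryty";
        System.out.println(strip(icuOutput)); // prints "sdam hsyn altkryty"
    }
}
```

Within ICU4J, a similar effect is usually obtained by chaining transforms in a compound ID passed to `Transliterator.getInstance`, for example `"Arabic-Latin; Latin-ASCII"`; that ID is offered as an assumption to verify against your ICU version rather than as the code the answer refers to.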