Edit 2:
const tamilRegex = XRegExp("\\p{Tamil}", "ug")
const match = XRegExp.exec(word, tamilRegex);
return match
I have since found XRegExp, a library that can handle Unicode characters. The code above is what I tried with that library, but it still returns the wrong value.
Any help?!
Edit 1:
const word = "யாத்திராகமம்"
const firstLetter = word.match(/[^\w]/u)
console.log(firstLetter)
The above code returns ய, which is not the correct first Tamil letter in that word; it should be யா.
Is there any way to get the proper first letter of a word using regex or any other library?
I don't know the Tamil script, but Wikipedia explains the concept of compound letters in that script. The Tamil Unicode block covers the range U+0B80 to U+0BFF. Within it, the subrange U+0BBE-U+0BCD and the single character U+0BD7 are suffixes that must be combined with the preceding consonant to form a compound letter.
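To see why the attempt in the question returns only ய, it may help to list the code points of the word. The following is a small sketch using the word from the question; the variable name codePoints is only illustrative.

```javascript
const word = "யாத்திராகமம்";

// Spreading a string iterates by code point, not by visible letter.
const codePoints = [...word].map(
  (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
);

// The first visible letter யா is stored as two code points:
// the consonant U+0BAF followed by the suffix U+0BBE.
console.log(codePoints.slice(0, 2)); // [ 'U+0BAF', 'U+0BBE' ]
```

A regex that matches a single character therefore stops after the consonant and leaves the suffix behind.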
Without any specialised library or smarter regex support, it seems you can make it work with the regex [\u0b80-\u0bff][\u0bbe-\u0bcd\u0bd7]?, which matches one character in the Tamil range, optionally followed by one of those suffix codes.
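As a minimal sketch, that regex could be applied to the word from the question as follows. The helper name firstTamilLetter is made up for illustration, and the pattern assumes a compound letter is at most one consonant plus one suffix, as described above.

```javascript
// Pattern from the answer: one Tamil character, optionally followed by
// one suffix code (U+0BBE-U+0BCD or U+0BD7).
const TAMIL_LETTER = /[\u0b80-\u0bff][\u0bbe-\u0bcd\u0bd7]?/;

// Hypothetical helper: returns the first Tamil letter in the word,
// including its suffix if present, or null if there is no Tamil character.
function firstTamilLetter(word) {
  const match = word.match(TAMIL_LETTER);
  return match ? match[0] : null;
}

console.log(firstTamilLetter("யாத்திராகமம்")); // யா
```

Because the pattern has no g flag, match returns only the first occurrence; adding the g flag would instead return every letter in the word.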