I have a regular expression to get the initials of a name, shown below:
/\b\p{L}\./gu
It works fine with English and other languages until it meets graphemes made of a base letter plus combining characters.
For example, क in Hindi and ಕ in Kannada are matched.
But के in Hindi and ಕೆ in Kannada are not matched by this regex.
I am trying to get the initials from a name like J.P.Morgan, etc.
Any help would be greatly appreciated.
You need to match diacritic marks after base letters using \p{M}*:
\b(?<!\p{M})\p{L}\p{M}*\.
The pattern matches
\b - a word boundary
(?<!\p{M}) - the char before the current position must not be a diacritic mark (without it, a match can start inside a single word)
\p{L} - any base Unicode letter
\p{M}* - zero or more diacritic marks
\. - a dot.
See the PHP demo online.
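
The /…/gu literal in the question looks like JavaScript, where \b is ASCII-based even with the u flag, so the pattern above may not carry over unchanged. Below is a minimal sketch for that case, assuming a negative lookbehind can stand in for the word boundary. This is a swapped-in variant, not the exact pattern from the answer, and the names INITIAL and extractInitials are hypothetical.

```javascript
// Hypothetical helper, not part of the original answer.
// Assumption: \b is ASCII-only in JavaScript, so the word boundary is
// emulated by requiring that the previous char is not a letter, mark,
// digit, or underscore.
const INITIAL = /(?<![\p{L}\p{M}\p{N}_])\p{L}\p{M}*\./gu;

function extractInitials(name) {
  // match() with the g flag returns all matches, or null when none are found.
  return name.match(INITIAL) || [];
}

// extractInitials("J.P.Morgan") returns ["J.", "P."]
```

The \p{M}* part is what lets a base letter with attached vowel signs, such as के or ಕೆ, count as a single initial. The lookbehind and Unicode property escapes need a reasonably recent engine.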