I have Arabic content like ضضضضضضض. I want to get the Unicode code points of all forms of the letters (initial, medial, final or isolated) in the given string.

A JavaScript library (not mine) can do this for you: https://github.com/louy/Javascript-Arabic-Reshaper

It takes a string that uses only the 'generic' letters (the base Arabic block, U+0600 to U+06FF) and returns a new string in which each letter has been replaced by its position-specific presentation form (Arabic Presentation Forms-B, U+FE70 to U+FEFF). From there, you can read the code point at each position.

Here is a sample usage:

// Import the library
var ArabicReshaper = require('arabic-reshaper');

// This can be a plain string literal. Building it from code points just
// makes sure the input is the "plain" letter, not an initial/medial/final form.
var originalString = String.fromCharCode(0x0636, 0x0636); // ضض

// Convert to the 'shaped' letters: each letter in the string itself is
// replaced by its initial/medial/final form (not just when it is drawn
// to the screen).
var newString = ArabicReshaper.convertArabic(originalString);

// Read the values. These are the specific initial/medial/final code points, not the generic ones.
console.log(
    newString.codePointAt(0).toString(16), // outputs febf
    newString.codePointAt(1).toString(16) // outputs febe
);
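
To cover every letter of a longer string rather than two fixed positions, something like the sketch below may help. `listCodePoints` is a hypothetical helper, not part of arabic-reshaper; it assumes the string has already been shaped (here the shaped input is written out as escapes so the snippet runs without the library). It also uses NFKC normalization to show which base letter each presentation form stands for.

```javascript
// Hypothetical helper (not part of arabic-reshaper): report the code point of
// every character in an already-shaped string, plus the base letter(s) it maps to.
function listCodePoints(shaped) {
  // Array.from iterates by code point, so surrogate pairs are not split.
  return Array.from(shaped, function (ch) {
    return {
      shaped: ch.codePointAt(0).toString(16),
      // NFKC folds a presentation form back to its base letter(s);
      // a ligature form can fold to more than one letter.
      base: Array.from(ch.normalize('NFKC'), function (c) {
        return c.codePointAt(0).toString(16);
      })
    };
  });
}

// Initial, medial and final forms of the letter DAD, written as escapes.
var forms = listCodePoints('\uFEBF\uFEC0\uFEBE');
console.log(forms);
```

With the library installed, the same helper could be called as `listCodePoints(ArabicReshaper.convertArabic(yourString))`.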