The code:
val plainText = "plainText"
val plainTextWithEmoji = "plainText"
println("plainText=$plainText, length=${plainText.length}")
println("plainTextWithEmoji=$plainTextWithEmoji, length=${plainTextWithEmoji.length}")
// Output:
// plainText=plainText, length=9
// plainTextWithEmoji=plainText, length=15
This output implies that an emoji character's length is 2, not 1, because length counts UTF-16 code units rather than visible characters.
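To make the difference concrete, here is a minimal sketch that compares the UTF-16 length with the number of code points. The helper name codePointLength is made up for illustration. Note that a code point is still not always a whole grapheme, since some emoji are built from several code points.

```kotlin
// Hypothetical helper (not from the original post): counts Unicode code points
// instead of UTF-16 code units, so a surrogate pair counts once.
fun codePointLength(text: String): Int = text.codePointCount(0, text.length)

// "\uD83D\uDE00" is a single emoji (U+1F600) stored as two Char values.
val sampleWithEmoji = "plainText\uD83D\uDE00"

// sampleWithEmoji.length counts Char values (UTF-16 code units),
// while codePointLength(sampleWithEmoji) counts the emoji only once.
```

For this sample, length reports one more than codePointLength, which is the same effect as in the output above.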
When I want to remove the last character:
If I call plainTextWithEmoji.subSequence(0, plainTextWithEmoji.length - 1), the result is wrong, because the emoji's length is more than 1 and the call cuts it in half.
To call subSequence and get the correct result, I have to do this instead: plainTextWithEmoji.subSequence(0, plainTextWithEmoji.length - 2)
But in general, we cannot know whether the last character's length is 1. So when we want to remove the last character, simply calling charSequence.subSequence(0, charSequence.length - 1) may return a wrong result.
So, is there any way to remove the last grapheme of a CharSequence? Thanks!
Finally, I found a solution inspired by this post. Since UTF-16 is a variable-length encoding, to call CharSequence.subSequence and get the correct result, we can use BreakIterator to find the start index of every grapheme in the text.
Example:
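A minimal sketch of this idea, assuming java.text.BreakIterator from the JDK is available. The extension name dropLastGrapheme is made up for illustration. How well multi-code-point emoji (for example ZWJ sequences) are treated as one grapheme depends on the JDK version, so treat this as a starting point rather than a complete answer.

```kotlin
import java.text.BreakIterator

// Hypothetical helper: removes the last grapheme as seen by BreakIterator.
fun CharSequence.dropLastGrapheme(): CharSequence {
    if (isEmpty()) return this
    val iterator = BreakIterator.getCharacterInstance()
    iterator.setText(toString())
    // Move to the boundary at the end of the text.
    iterator.last()
    // Step back one boundary: this is the start index of the last grapheme.
    val start = iterator.previous()
    return if (start == BreakIterator.DONE) "" else subSequence(0, start)
}
```

With this helper, "plainText\uD83D\uDE00".dropLastGrapheme() removes both Char values of the emoji in one step, while "abc".dropLastGrapheme() removes just the "c".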