The Unicode standard defines a grapheme cluster as an algorithmic approximation to a "user-perceived character". A grapheme cluster more or less corresponds to what people think of as a single "character" in text. Therefore it is a natural and important requirement in programming to be able to operate on strings as sequences of grapheme clusters.
The best general-purpose definition is the extended grapheme cluster; other algorithms (tailored grapheme clusters) exist for specific localized usages.
In Crystal, how can I iterate over (or otherwise operate on) a String as a sequence of grapheme clusters?
This answer is based on a thread in the Crystal forum.
Crystal does not have a built-in way to do this (unfortunately) as of 1.0.0.
However, the regex engine in Crystal does: the \X pattern matches a single extended grapheme cluster, so scanning a String with it yields one match per cluster.
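The original snippet did not survive here, so the following is only a minimal sketch of the idea, assuming String#scan with a block and a sample string chosen for illustration ("e" followed by a combining acute accent, then "a"):

```crystal
# Sketch: iterate over extended grapheme clusters using the \X regex pattern.
# The sample text is an assumption made for illustration.
text = "e\u0301a" # "e" + U+0301 COMBINING ACUTE ACCENT, then "a"

clusters = [] of String
text.scan(/\X/) do |match|
  # match is a Regex::MatchData; match[0] is the whole matched cluster
  clusters << match[0]
end

puts text.size     # number of code points (Char values)
puts clusters.size # number of grapheme clusters
```

The combining accent attaches to the preceding "e", so the string has more code points than grapheme clusters.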
You can wrap this up in a nicer API by reopening String and adding methods built on the \X scan.
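One possible shape for such a wrapper is sketched below. The method names each_grapheme_cluster and grapheme_clusters are hypothetical, chosen for this example; they are not part of the standard library as of 1.0.0 and are not necessarily what the original answer used.

```crystal
# Sketch of a wrapper API; both method names are hypothetical.
class String
  # Yields each extended grapheme cluster of the string as a String.
  def each_grapheme_cluster : Nil
    scan(/\X/) { |match| yield match[0] }
  end

  # Returns all extended grapheme clusters as an Array(String).
  def grapheme_clusters : Array(String)
    result = [] of String
    each_grapheme_cluster { |cluster| result << cluster }
    result
  end
end

"e\u0301a".each_grapheme_cluster do |cluster|
  puts cluster
end
```

Reopening String keeps call sites short, at the cost of adding names to a core class; a standalone module with a method taking the String as an argument would avoid that if it is a concern.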