Is the set of distinct graphemes infinite?

Is there any limit to the number of distinct graphemes that can be represented with a Unicode encoding such as UTF-8? Does, for example, the Unicode standard restrict the number of consecutive combining characters?

There is 1 best solution below

The set of possible combinations of a base character and the combining marks that follow it is infinite (though only countably infinite ☺). The Unicode Standard says explicitly in clause 2.11 (in chapter 2): “All combining characters can be applied to any base character and can, in principle, be used with any script.” A combination of a letter and a diacritic can in turn serve as the base for another diacritic, and so on without end.
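To make this concrete, here is a minimal Python sketch. It assumes U+0301 COMBINING ACUTE ACCENT as the mark, and the helper name `stack_marks` is made up for illustration. Each additional mark produces a sequence that remains distinct from the shorter ones even after NFC normalization, so there is no point at which the sequences start repeating.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # U+0301 COMBINING ACUTE ACCENT (chosen for illustration)

def stack_marks(base: str, mark: str, count: int) -> str:
    """Return `base` followed by `count` copies of the combining mark `mark`."""
    return base + mark * count

# Build "a" followed by 0, 1, 2, 3 and 4 acute accents.
clusters = [stack_marks("a", COMBINING_ACUTE, n) for n in range(5)]

# Normalize to NFC so that canonically equivalent sequences would collapse
# into one string; the sequences here stay pairwise distinct.
normalized = {unicodedata.normalize("NFC", c) for c in clusters}

print(len(clusters), len(normalized))
```

Note that Python's `len()` counts code points, not graphemes. Under the default segmentation rules of UAX #29, each of these strings is a single extended grapheme cluster, because no boundary is placed before a combining mark.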

At a higher protocol level, such as a data format specification, you can of course impose a limit, for example on the number of consecutive combining marks. The Unicode Standard itself, however, sets no such restriction.
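A higher-level check of this kind could look roughly like the sketch below. The function name `exceeds_mark_limit` and the choice of limit are assumptions for illustration only; "combining mark" is taken here to mean any character whose general category is Mn, Mc, or Me.

```python
import unicodedata

def exceeds_mark_limit(text: str, max_marks: int) -> bool:
    """Return True if `text` contains a run of more than `max_marks`
    consecutive combining marks (general categories Mn, Mc, Me)."""
    run = 0
    for ch in text:
        if unicodedata.category(ch).startswith("M"):
            run += 1
            if run > max_marks:
                return True
        else:
            run = 0  # any non-mark character starts a new run
    return False

# Hypothetical policy: accept at most 2 consecutive marks.
print(exceeds_mark_limit("a" + "\u0301" * 3, 2))
print(exceeds_mark_limit("a\u0301b\u0301", 2))
```

An application would apply such a check to its own input and reject or truncate offending text; nothing in Unicode requires it.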