In a Unicode string, each grapheme consists of one or more code points. However, there are some code points, such as the zero-width joiner (ZWJ), which are never a grapheme on their own. The ZWJ is, in itself, invisible. Are all of those "non-grapheme" code points always invisible?
Are all "non-grapheme" code points invisible?
Asked by at54321

There are many combining characters which are intended to modify a base character. Whether they render as a grapheme on their own is partially an implementation detail, I expect.
- Example: o followed by U+0308 COMBINING DIAERESIS produces ö (the glyph in isolation is rendered by your browser as  ̈ )
- List of all code points in this category: https://codepoints.net/search?lb=CM
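The combining example above can be checked with Python's standard `unicodedata` module. This is only a rough sketch: it inspects the character properties and the NFC-composed form, and says nothing about how a particular font or browser draws the mark in isolation. The variable names are made up for illustration.

```python
import unicodedata

base = "o"
mark = "\u0308"  # COMBINING DIAERESIS

combined = base + mark                              # two code points
composed = unicodedata.normalize("NFC", combined)   # precomposed form

print(unicodedata.name(mark))        # COMBINING DIAERESIS
print(unicodedata.category(mark))    # Mn (Mark, nonspacing)
print(unicodedata.combining(mark))   # 230 (non-zero: it attaches to a base)
print(len(combined), len(composed))  # 2 1
print(composed == "\u00f6")          # True: same as LATIN SMALL LETTER O WITH DIAERESIS
```

The `Mn` category is a hint that the code point is meant to attach to a base character, but it still has a visible glyph of its own.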
Recent Unicode versions also have modifier characters which change how the preceding emoji is rendered, most famously to add a skin tone to emoji with human figures or faces. These by definition are not graphemes in their own right, though again, rendering engines are probably free to figure out a way to represent them if they are encountered in isolation.
- Example: U+1F44B WAVING HAND SIGN followed by U+1F3FB EMOJI MODIFIER FITZPATRICK TYPE-1-2 (which in isolation renders as 🏻) produces 👋🏻
- Full catalog: https://www.unicode.org/emoji/charts/full-emoji-modifiers.html
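As a small sketch of the emoji example above, the same `unicodedata` module shows that the skin tone modifier is classified as a symbol rather than as a combining mark, which fits with it having a visible fallback rendering when it stands alone. Whether the pair is actually drawn as one glyph depends on the font and renderer, which this code does not test; the variable names are illustrative only.

```python
import unicodedata

hand = "\U0001F44B"  # WAVING HAND SIGN
tone = "\U0001F3FB"  # EMOJI MODIFIER FITZPATRICK TYPE-1-2

sequence = hand + tone  # what a renderer may draw as a single emoji

print(unicodedata.name(hand))       # WAVING HAND SIGN
print(unicodedata.name(tone))       # EMOJI MODIFIER FITZPATRICK TYPE-1-2
print(unicodedata.category(hand))   # So (Symbol, other)
print(unicodedata.category(tone))   # Sk (Symbol, modifier)
print(unicodedata.combining(tone))  # 0: not a combining mark
print(len(sequence))                # 2 code points
```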
The Unicode representation of the Ogham script is notable for containing a visible whitespace character (U+1680 OGHAM SPACE MARK).
Tom Scott made an excellent YouTube video on the subject: link
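To illustrate the Ogham point, here is a minimal check that U+1680 is treated as whitespace by Unicode properties even though fonts typically draw it as a stroke. The sample string is made up, and the code only inspects properties, not rendering.

```python
import unicodedata

ogham_space = "\u1680"  # OGHAM SPACE MARK

print(unicodedata.name(ogham_space))      # OGHAM SPACE MARK
print(unicodedata.category(ogham_space))  # Zs (Separator, space)
print(ogham_space.isspace())              # True

# str.split() with no arguments splits on Unicode whitespace,
# so the Ogham space mark acts as a word separator.
words = ("a" + ogham_space + "b").split()
print(words)                              # ['a', 'b']
```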