I notice that when normalizing a Unicode string to NFKC form, superscript characters like ¹ (U+00B9), ² (U+00B2), ³ (U+00B3), etc. are converted to the corresponding ASCII digits (e.g. 1, 2, 3, etc.).
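Here is a minimal Python sketch of what I'm seeing, using the standard library's `unicodedata` module (the strings are just illustrative):

```python
import unicodedata

s = "x\u00b2 + y\u00b3"  # "x² + y³"

# NFKC applies compatibility decomposition, so superscripts become ASCII digits
print(unicodedata.normalize("NFKC", s))  # "x2 + y3"

# NFC (canonical-only) leaves the superscript characters untouched
print(unicodedata.normalize("NFC", s))   # "x² + y³"
```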
Does anyone know the rationale for this behavior? It seems to lose information in the process: a superscript number usually carries some contextual meaning (an exponent or a footnote marker, for example).