Why does Unicode have multiple characters representing the same letter?

ASCII has versions of the whole Roman alphabet. I was surprised recently to learn that Unicode contains other versions of those same characters. One example is "U+1D5C4: MATHEMATICAL SANS-SERIF SMALL K", or "𝗄".

Can't LaTeX math mode, or MS Word equation editor, or whatever other program just use a sans-serif font if it wants the letters in a mathematical formula to be sans-serif?

There are 2 best solutions below

2
一二三 On BEST ANSWER

These characters exist so that the semantic distinction between them can be encoded in plain text, or in any other context where the specific font shape can't be controlled.

The block you mention is only intended for use in mathematical and technical contexts, where the distinction between, say, 𝑑 as a variable vs. d as a differential operator vs. 𝐝 as an object (in category theory) is important. Unicode TR #25 gives another example where losing the distinction between ℋ and H can completely change the meaning of an equation. Being able to encode this formatting into the text itself is also important for ISO 31-11.

All of these characters maintain compatibility mappings with their "normal" Latin and Greek counterparts, so the distinction between them should not affect searching and sorting.
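As a rough illustration of those compatibility mappings, here is a minimal Python sketch using the standard `unicodedata` module. The variable names are just for illustration, and it assumes your search or sort code applies NFKC (or NFKD) normalization, which is not automatic:

```python
import unicodedata

# U+1D5C4, the character from the question
math_k = "\U0001D5C4"

# Its formal name and its compatibility decomposition to plain "k" (U+006B)
char_name = unicodedata.name(math_k)
decomposition = unicodedata.decomposition(math_k)

# Canonical normalization (NFC) keeps the mathematical letter as-is;
# compatibility normalization (NFKC) folds it to the ordinary Latin letter.
nfc = unicodedata.normalize("NFC", math_k)
nfkc = unicodedata.normalize("NFKC", math_k)

print(char_name)
print(decomposition)
print(nfc == math_k, nfkc == "k")
```

So the distinction survives in stored text, and a search that wants to ignore it can normalize both the query and the text with NFKC before comparing.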

2
RedX On

You are confusing how text is displayed with how it is encoded.

The idea is that Unicode contains ALL the symbols known to mankind that are used for writing, grouped by usage. That's why you will find many code points that look alike.
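To see look-alike code points for yourself, here is a small Python sketch. The three letters are my own choice of example; in most fonts they render almost identically, yet each belongs to a different script:

```python
import unicodedata

# Latin A, Greek Alpha, Cyrillic A: near-identical glyphs, different code points
lookalikes = ["A", "\u0391", "\u0410"]

names = [unicodedata.name(ch) for ch in lookalikes]

for ch, name in zip(lookalikes, names):
    print(f"U+{ord(ch):04X} {name}")

# Despite looking alike, they are three distinct characters
distinct_count = len(set(lookalikes))
```

Each one is encoded separately because it is used differently, not because it looks different.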

So a k in a formula is supposed to be different from a k in a written word. The sans-serif part is just a description of the kind of k best used to display it. Tomorrow somebody might want to add a serif k, and then how would you describe the difference?