Consider the letters in the picture below.
The first row shows the letters themselves, the second row numbers them, and the third row shows their Unicode code points encoded as three hex UTF-8 bytes each. For example, letter 2 is DEVANAGARI LETTER MA with code point U+092E (= 2350 decimal), which is encoded as the three UTF-8 bytes e0, a4, ae.
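The code point/byte relationship above can be checked in a few lines of Python (this only restates the example already given for letter 2, MA):

```python
ma = "\u092e"  # DEVANAGARI LETTER MA
print(ord(ma))                      # 2350, i.e. hex 0x92E
print(ma.encode("utf-8").hex(" "))  # e0 a4 ae
```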
My question is about the rendering of a specific connected letter such as (1). How is this rendering handled by the rendering system? The way we typically input this connected letter is by first entering letter 2, then letter 4 (indicating our intent to join this letter with the next one), and then letter 3. The rendering system then respects the join by erasing the vertical line in letter 2 and overlaying letter 4 right there. It is not clear to me that the chosen font contains glyphs for both the complete letter 2 and its half form with the vertical line erased (shown with the faint red oval).
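As a concrete illustration of the input sequence described above, assuming the joiner (letter 4) is DEVANAGARI SIGN VIRAMA (U+094D) and using DEVANAGARI LETTER NA purely as a stand-in for letter 3 (the actual letters in the picture are not given here):

```python
import unicodedata

# Assumed input order: letter 2 (MA) + joiner (VIRAMA) + stand-in for letter 3 (NA)
seq = "\u092e\u094d\u0928"
for ch in seq:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
# A shaping engine renders these three code points as a single conjunct:
print(seq)
```

Note that the conjunct exists only at rendering time; the underlying text stays three code points.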
Can someone explain how this works?
Font files are more than a bunch of shapes for each letter. They contain various tables that dictate how glyphs behave.
There are, among others, glyph substitution (GSUB) and glyph positioning (GPOS) tables.
See also: https://fontforge.github.io/gposgsub.html
Which font features are needed depends on the writing system (Latin, Cyrillic, Arabic, Devanagari) and how its glyphs ought to behave. Which tables are used depends on the font designer and the font file type (what is designed and what can be stored). Which features are displayed depends on the font renderer (sometimes font instructions are ignored by the renderer).
Back to your question: it is a substitution. What exactly happens is described by the information in the tables of the font file itself. If you really want to know what happens, you have to open the font in an editor and inspect the various tables. I suggest using FontForge (free and gratis).
The moral of the story is that font files are not only collections of aesthetic letter shapes but also pieces of software.