strlen function giving wrong length when there are non-English characters in string

153 Views Asked by Bover At 26 June 2025 at 06:19

I have a program that also accepts non-English characters as input. Because we use strlen, the computed length is wrong whenever the string contains a non-English character. For the input nova the output is 4, but for the input ñova the output is 5, where I would expect 4.

strlen("nova") = 4
strlen("ñova") = 5

In the second case, I would expect the output to be 4 instead.


There is 1 best solution below

Toby Speight On 21 December 2023 at 10:02

Remember that strlen returns the count of char in the string, which is not necessarily the same as the number of visible glyphs when it's printed.

The result will depend on your system's character encoding - with ISO-8859-1, "ñova" is the same as {241, 111, 118, 97, 0} (length 4), but if you use UTF-8, for example, then ñ is a multi-byte character and the string is represented as {195, 177, 111, 118, 97, 0} (length 5).
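To make the two representations concrete, here is a minimal sketch that spells out both byte sequences explicitly, so the result does not depend on how the source file itself is encoded. The names nova_latin1, nova_utf8 and byte_length are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "ñova" in ISO-8859-1: ñ is the single byte 241. */
const unsigned char nova_latin1[] = { 241, 111, 118, 97, 0 };

/* "ñova" in UTF-8: ñ is the two-byte sequence 195, 177. */
const unsigned char nova_utf8[] = { 195, 177, 111, 118, 97, 0 };

/* Hypothetical wrapper: strlen counts chars (bytes) up to the
   terminating 0 and knows nothing about the encoding. */
size_t byte_length(const unsigned char *s)
{
    assert(s != NULL);
    return strlen((const char *)s);
}
```

With these definitions, byte_length(nova_latin1) is 4 and byte_length(nova_utf8) is 5, which matches the numbers in the question.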

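If you want to count the number of code points, then strlen() is the wrong tool. You probably want to step through the string with mbrlen(), which reports how many bytes the next multi-byte character occupies in the current locale's encoding (so call setlocale() first), or ask mbstowcs(NULL, s, 0) for the count in one call. If you want to count the number of "user" characters, taking account of combining accents and the like, then you really need a character-handling library such as ICU.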
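If the input is known to be UTF-8, a locale-independent alternative to the mbrlen() approach is to count every byte that is not a continuation byte. The sketch below does that. It is a different technique from the one named above, the helper name utf8_codepoints is invented for this example, and it assumes the string is valid UTF-8 (it does no validation).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts code points in a NUL-terminated string
   that is ASSUMED to be valid UTF-8. Malformed input is not detected. */
size_t utf8_codepoints(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != 0; ++p) {
        /* Continuation bytes have the bit pattern 10xxxxxx.
           Every other byte starts a new code point. */
        if ((*p & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the UTF-8 bytes of "ñova" ("\xC3\xB1ova") this returns 4, while strlen() returns 5. It still counts code points rather than "user" characters: an n followed by a combining tilde ("n\xCC\x83ova") gives 5, which is the case where a library such as ICU is needed.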
