I learned that in C the char type ranges from -128 to 127, but it doesn't seem to


This might be a very basic problem, but I couldn't figure it out. Here is what I am working with.

#include <stdio.h>

int main(void)
{
    char c1, c2;
    int s;
    c1 = 128;
    c2 = -128;

    s = sizeof(char);

    printf("size of char: %d\n", s);
    printf("c1: %x, c2: %x\n", c1, c2);
    printf("true or false: %d\n", c1 == c2);
}

The result is like this.

size of char: 1
c1: ffffff80, c2: ffffff80
true or false: 1

I assigned the value 128 to a signed (plain) char, but it didn't overflow.

In addition, c1 and c2 both seem to hold 4 bytes, and -128 and 128 appear to be the same value.

How can I understand these facts? I need your help. Thank you very much.


There are 4 answers below.

BEST ANSWER

In c1 = 128;, 128 does not fit in the signed eight-bit char that your C implementation uses. 128 is converted to char per C 2018 6.5.16.1 2: “the value of the right operand is converted to the type of the assignment expression…”

The conversion is implementation-defined, per 6.3.1.3 3: “Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.” Your C implementation converted 128, which is 10000000₂ as an unsigned binary numeral, to −128, which is represented with the same bits when using two’s complement for signed binary. Thus, the result is that c1 contains the value −128.

In printf("c1: %x, c2: %x\n", c1, c2);, c1 is converted to an int. This is because the rules for calling functions with ... parameters are to apply the default argument promotions to the corresponding arguments, per 6.5.2.2 7: “The default argument promotions are performed on trailing arguments.”

The default argument promotions include the integer promotions, per 6.5.2.2 6. When the range of char is narrower than int, as it is in most C implementations, the integer promotions convert a char to an int, per 6.3.1.1 2: “If an int can represent all values of the original type…, the value is converted to an int…”

Thus, in printf("c1: %x, c2: %x\n", c1, c2);, an int value of −128 is passed as the second argument. Your C implementation uses 32-bit two’s complement for int, in which −128 is represented with the bits 11111111111111111111111110000000, which we can express in hexadecimal as ffffff80.

The format string specifies a conversion using %x. The proper argument type for %x is unsigned int. However, your C implementation has accepted the int and reinterpreted its bits as an unsigned int. Thus, the bits 11111111111111111111111110000000 are converted to the string “ffffff80”.

This explains why “ffffff80” is printed. It is not because c1 has four bytes but because it was converted to a four-byte type before being passed to printf. Further, the conversion of a negative value to that four-byte type resulted in four bytes with many bits set.

Regarding c1 == c2 evaluating to true (1), this is simply because c1 was given the value −128 as explained above, and c2 = -128; also assigns the value −128 to c2, so c1 and c2 have the same value.

ANSWER

In the statement

printf("c1: %x, c2: %x\n", c1, c2);

%x expects an argument of type unsigned int, so the values of c1 and c2 are promoted from char to int (with the sign bit extended) and then reinterpreted as unsigned int. To print the numeric value of an unsigned char as hex, you need to use the hh length modifier in the conversion:

printf("c1: %hhx, c2: %hhx\n", c1, c2);

As for the values that can be represented in a char, it's a little more complicated than that.

The encodings for members of the basic character set¹ are guaranteed to be non-negative. Encodings for additional characters may be negative or non-negative.

Thus, depending on the implementation, a plain char may represent values in at least the range [-128..127] (assuming two's complement representation) or [0..255]. I say "at least" since CHAR_BIT may be more than 8 (there are historical systems that used 9-bit bytes and 36-bit words). A signed char will represent values in at least the range [-128..127] (again, assuming two's complement).

Assuming char is signed and 8 bits, then assigning 128 to c1 means the value cannot be represented in the target type. The result of that conversion is implementation-defined (or an implementation-defined signal is raised), so the compiler and execution environment aren't required to produce any particular value. Getting -128 is what typical two's-complement implementations do, but the language definition doesn't promise it, whether it's the result you expected or not.


  1. Upper- and lowercase Latin alphabet, decimal digits, 29 graphical characters, whitespace and control characters (line feed, form feed, tab, etc.).

ANSWER

The type char can behave as the type signed char or as the type unsigned char, depending on the compiler's options or its default settings.

In your case the type char behaves as the type signed char. In this case CHAR_MIN is equal to -128 and CHAR_MAX is equal to 127.

So an object of the type char cannot hold the positive number 128. Internally this value has the hexadecimal representation 0x80. Stored in an object of the type char, it is interpreted as a negative value because the sign bit is set. This negative value is -128.

So after these statements

c1 = 128;
c2 = -128;

both objects hold the same value, -128.

And the output

c1: ffffff80, c2: ffffff80

of this call

printf("c1: %x, c2: %x\n", c1, c2);

shows that both objects c1 and c2, promoted to the type int, have the same representation of a negative value.

Note that assigning to an object of a signed type a value that cannot be represented in it yields an implementation-defined result.

ANSWER

It's explained here: https://en.wikipedia.org/wiki/Signed_number_representations

If -128 and 128 and all the numbers in between were represented in one byte, that set would contain 257 numbers. A byte, however, holds just 256.

It's mapped as follows in decimal: [0..127, -128..-1] => [0b00000000..0b11111111]. Note that the first bit becomes 1 at -128, happy accident ;).

Also your string formatting is off, and your compiler should warn you: %x expects an unsigned int, so your char arguments get promoted to 4 bytes. If you take into account what I said earlier, then you see that 0x80 is indeed 0b10000000.