Why does using char argv instead of char **argv as the argument of main cause the following output?


When I do this:

#include <stdio.h>

int main(int argc, char argv)
{
    printf("%d", argv);
    return 0;
}

I get this output when I run the program from the command line:

$ prog_name 0
0

$ prog_name (from 0-7 characters)
48

$ prog_name 12345678
56

$ prog_name 1234567812345678
64

// and so on...

So where do these values come from, and why do they increment by 8?

What happens when I have this instead:

int main(int argc, char argv[])

?


There are 3 best solutions below

9
On

From the C standard, regarding the signature of main():

The implementation declares no prototype for this function.

So the compiler is not required to reject a main() declared with different parameter types, though many compilers will warn about it.

In your code,

int main(int argc, char argv)

is not a signature the standard specifies for main(). To receive command-line arguments, it should be either

int main(int argc, char* argv[])

or, equivalently,

int main(int argc, char** argv)

Otherwise, in a hosted environment, the behavior is not defined unless the implementation documents the form you used. You can read more on this in the C11 standard, chapter 5.1.2.2.1.

In your case, as you see, you are making the second parameter a char type. As per the standard specification,

If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings,....

So here the environment passes a pointer (to the array of pointers to the argument strings, one of which is your "0") as the second argument of main(), but main() receives it in a char. That is undefined behavior.

4
On

There is a string pointer on the stack, but you declared main with a char there, and then printed it as a decimal. The memory address of that string is not predictable, so you get unpredictable output.

Try this:

#include <stdio.h>

int main( int argc, char* argv[] )
{
    if ( argc > 1 )    /* argv[1] exists only if an argument was given */
        printf( "%s\n", argv[1] );
    return 0;
}

I think that will give you what you intended.

8
On

Your output is likely the address held in the "ordinary" argv parameter, interpreted as a char. In other words, I suspect that what you have is equivalent to:

int main(int argc, char **argv)
{
    printf("%d", (char) argv);
    return 0;
}

On my machine (CentOS 6, 32-bit) the disassembled object code for this version is as follows:

   0x080483c4 <+0>:     push   %ebp
   0x080483c5 <+1>:     mov    %esp,%ebp
   0x080483c7 <+3>:     and    $0xfffffff0,%esp
   0x080483ca <+6>:     sub    $0x10,%esp
   0x080483cd <+9>:     mov    0xc(%ebp),%eax
   0x080483d0 <+12>:    movsbl %al,%eax
   0x080483d3 <+15>:    mov    %eax,0x4(%esp)
   0x080483d7 <+19>:    movl   $0x80484b4,(%esp)
   0x080483de <+26>:    call   0x80482f4 <printf@plt>

and for the original code that you posted:

   0x080483c4 <+0>:     push   %ebp
   0x080483c5 <+1>:     mov    %esp,%ebp
   0x080483c7 <+3>:     and    $0xfffffff0,%esp
   0x080483ca <+6>:     sub    $0x20,%esp
   0x080483cd <+9>:     mov    0xc(%ebp),%eax
   0x080483d0 <+12>:    mov    %al,0x1c(%esp)
   0x080483d4 <+16>:    movsbl 0x1c(%esp),%eax
   0x080483d9 <+21>:    mov    %eax,0x4(%esp)
   0x080483dd <+25>:    movl   $0x80484b4,(%esp)
   0x080483e4 <+32>:    call   0x80482f4 <printf@plt>

In both cases $0x80484b4 is the address of the "%d" format string literal, and 0xc(%ebp) holds the second argument of main(), from which the value passed to printf() is taken:

(gdb) x/db 0xbffff324
0xbffff324: -60
(gdb) p $al
$3 = -60

Notice that AL (the one-byte accumulator, i.e. the low part of EAX) holds only the first byte at the $ebp+0xc address (my CPU is little-endian, so that is the least significant byte). This means the conversion to char truncates the argv address to its lowest byte, which movsbl then sign-extends before it is passed to printf().
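The truncation can be modelled in portable C. This is only a sketch: low_byte_as_char is a hypothetical helper name, and it imitates what the movsbl instruction does rather than reproducing the undefined behavior itself.

```c
#include <stdint.h>

/* Hypothetical helper: imitates `movsbl %al,%eax`, i.e. keep only the
   least significant byte of an address and sign-extend it to an int. */
int low_byte_as_char(uintptr_t address)
{
    int low = (int)(address & 0xFFu);      /* the byte that ends up in AL: 0..255 */
    return low >= 128 ? low - 256 : low;   /* reinterpret it as a signed 8-bit value */
}
```

For example, any address whose lowest byte is 0xc4 (such as the made-up 0xbffff3c4) gives -60, matching the gdb session above, while an address ending in 0x30 gives 48.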

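As a consequence, you may observe that each of these numbers has its log2(n) least significant bits unset, where n is the alignment in question. This is due to the alignment requirement for objects of pointer type. Typically, for a 32-bit x86 machine, alignof(char **) == 4. The values in the question (48, 56, 64) are all multiples of 8, which would be consistent with an 8-byte alignment, as is typical on a 64-bit system.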
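The alignment argument can be checked with a small bit test. A minimal sketch, assuming the alignment is a power of two; is_aligned is a hypothetical helper, not something from the standard library.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if `address` is a multiple of `alignment`.
   `alignment` is assumed to be a power of two, so the check reduces to
   testing that the low log2(alignment) bits are all zero. */
int is_aligned(uintptr_t address, uintptr_t alignment)
{
    return (address & (alignment - 1)) == 0;
}
```

Because 256 is a multiple of both 4 and 8, the test gives the same answer for the truncated low byte as for the full address. The values 48, 56 and 64 from the question pass with an alignment of 8, and the low byte 0xc4 from the gdb session passes with an alignment of 4 but not 8.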

As already pointed out in the comments, you violated the C Standard, so this is an example of undefined behavior (UB).