Custom macro logic


I was doing some exercises online and I was given this code:

#include <stdio.h>

#define CUSTOM_ADD(x) ++x+++x

int main() {
    int a=6;
    int b=8;
    
    printf("%d",a + CUSTOM_ADD(a+b)+b);
    return 0;
}

I reasoned that the sum a+b (14) is passed to the macro, which would expand to ++14 + ++14 = 30, so the final result should be 44. I also considered that the macro might expand as ++14++ + 14, which would be 29, since the value would be incremented after use, giving a final result of 43. I was not sure which one was right, so I ran the code, and the output was 47. I could not find any way to arrive at that result.


There are 2 best solutions below

Answered by Eric Postpischil

Macro processing works by replacing grammatical tokens.

In #define CUSTOM_ADD(x) ++x+++x, the replacement list is parsed to the tokens ++ x ++ + x. When determining the next token, the rule is to take the longest sequence of characters that could constitute a token, so analyzing +++ produces ++ first, then +.

In printf("%d",a + CUSTOM_ADD(a+b)+b);, macro replacement results in printf("%d",a + ++ a + b ++ + a + b + b);.

The second argument to printf is parsed as a + (++a) + (b++) + a + b + b. The behavior of this code is not defined by the C standard because it both modifies a, in ++a, and uses the value of a without sequencing between them. The same problem exists with b++ and b.

In ordinary modern C implementations, it is not uncommon for the above expression to produce a value of a+a+a+b+b+b as if a and b had been incremented somewhere in the sequence of additions, with a incremented at least once before its last use. Thus, with a starting at 6 and b starting at 8, it could produce 43 (6+6+7+8+8+8, with a incremented after the first addition of 6+6 and b incremented after all of them) or 48 (7+7+7+9+9+9) or anything in between. However, another potential result is that compiler analysis determines the behavior of the code is not defined by the C standard, concludes that a well-written program would never transfer control to this execution path, and removes the path from the program entirely.

One reason the rule exists is that, for some types, updating a variable requires multiple operations, such as for wide integers implemented using multiple operations on small processors. And a compiler could not always tell whether an expression was both updating a variable and using the same variable, as when a pointer is used. For example, in ++*p + *q, the compiler generally does not know whether p and q point to the same thing. If they did, and multiple steps are needed to increment the variable, the compiler could generate instructions that are incrementing *p interleaved with instructions that are fetching its value for *q. The C standard committee decided to put the onus on programmers to avoid this.

Answered by Vlad from Moscow

Take the expression a+b used as the argument in this macro invocation

CUSTOM_ADD(a+b)

and substitute it for the parameter x in the macro definition

#define CUSTOM_ADD(x) ++x+++x

and you will get

++a + b++ +a + b

As a result this statement

printf("%d",a + CUSTOM_ADD(a+b)+b);

will look like

printf("%d",a + ++a + b++ +a + b + b);

which invokes undefined behavior because the variables a and b are each both modified and read within one expression, with no sequencing between the modification and the reads.

That is, expressions such as

a + ++a

and

b++ + b

have undefined behavior.

From the C Standard (J.2 Undefined behavior)

— A side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object (6.5).