Is INT_MAX+a-b not cause overflow but INT_MAX*a/b cause overflow (if a>1 and a=b)?

363 Views Asked by At

I want to fix some problem about overflow. I have some data use int to store, the data does not cause overflow but the calculation intermediate may cause overflow.

For example, I need to store the diagonal of square, the length of side is 50000, so the diagonal is 70710, which the side and diagonal is far smaller than INT_MAX, but for calculation, aa+bb in sqrt(aa+bb) will cause overflow.

I want to follow "just use int" rule, so I may need to cast each variable every time:

int f=(long)a+(long)b*(long)c/(long)d-(long)e;

but each time add (long) affects readability, I test which operation may cause overflow and which may have auto cast:

#include <sstream>
int main(){
    int a=rand();
    int b=a;
    printf("%d\n",a);
    printf("%d\n",INT_MAX);
    printf("\n");
    printf("%d\n",INT_MAX+a-b);
    printf("%d\n",INT_MAX-b+a);
    printf("%d\n",a+INT_MAX-b);
    printf("%d\n",a-b+INT_MAX);
    printf("%d\n",-b+a+INT_MAX);
    printf("%d\n",-b+INT_MAX+a);
    printf("\n");
    printf("%d\n",INT_MAX*a/b);
    printf("%d\n",INT_MAX/b*a);
    printf("%d\n",a*INT_MAX/b);
    printf("%d\n",a/b*INT_MAX);
    printf("\n");
    printf("%ld\n",(long)INT_MAX*a/b);
    printf("%ld\n",INT_MAX*a/(long)b);
    return 0;
}

the output is:

16807
2147483647

2147483647
2147483647
2147483647
2147483647
2147483647
2147483647

127772
2147480811
127772
2147483647

2147483647
127772

I use rand() to ensure no compile time calculation, I found for + and - the result is the same for different sequence of INT_MAX,+a and -b, but for *a and /b it is not.

Also I found even use casting, (long)INT_MAXa/b is normal but INT_MAXa/(long)b is not.

I guess for + and -, if the result is smaller than INT_MAX, it would not cause overflow even the calculation intermediate (e.g.:INT_MAX+a in INT_MAX+a-b) may cause overflow, but for * and / ,the overflow intermediate would affect the result, is it right?

Also for * and /, I guest the operation starts from left hand side, so casting need start from left hand side (e.g.:(long)INT_MAX*a/b), is it also right?

So, if my data does not cause overflow but the calculation may cause overflow, is

int f=a+b*c/d-e;

only need to rewrite as

int f=a+(long)b*c/d-e;

?

1

There are 1 best solutions below

4
On

data does not cause overflow but the calculation intermediate may cause overflow.

To avoid int overflow, which is undefined behavior, the simplest solution is to use a wide enough integer type.

int foo1(int a, int b, int c, int d) {
  int f=(long)a+(long)b*(long)c/(long)d-(long)e; // OP's stating point, but see foo2
  return f;
}

but each time add (long) affects readability

To avoid unnecessary casting and its readability, use * one only where needed. A good compiler will optimize out the explicit multiplication yet retain the type promotion.

int foo2(int a, int b, int c, int d) {
  int f = a + 1L*b*c/d - e; // Cleaner yet see foo3
  return f;
}

To insure a potential wider type is wide enough (long may be the same width as int), perform compile time tests

// Find a type where INT_MAX*INT_MAX <= some_type_MAX
#if LONG_MAX/INT_MAX >= INT_MAX
  #define WIDE1 1L
#elif LLONG_MAX/INT_MAX >= INT_MAX
  #define WIDE1 1LL
#elif INTMAX_MAX/INT_MAX >= INT_MAX
  #define WIDE1 ((intmax_t)1)
#else
  #error Out of luck
#endif

int foo3(int a, int b, int c, int d) {
  int f = a + WIDE1*b*c/d - e;
  return f;
}

To only use type int math is work to be avoided.


.. but for calculation, aa+bb in sqrt(aa+bb) will cause overflow.

For this case

int hypoti1(int a, int b) {
  return sqrt(WIDE1*a*a + WIDE1*b*b);
}

// or simply

int hypoti1(int a, int b) {
  return hypot(a, b);
}