The results differ with parallel summation in Java

I made a class Sum that extends RecursiveAction. The task is to compute the sum of 1 / a[i].

import java.util.concurrent.RecursiveAction;

public class Sum extends RecursiveAction {
    int[] items;
    double result;
    // Ranges of at most `min` elements are summed directly instead of being split further.
    int min = 100000;
    int from, to;

    Sum(int[] items, int from, int to) {
        this.items = items;
        this.from = from;
        this.to = to;
    }

    @Override
    protected void compute() {
        if (to - from <= min) {
            // Small enough: sum the reciprocals sequentially.
            for (int i = from; i < to; i++) {
                result += 1d / items[i];
            }
        } else {
            // Otherwise split the range in half and let the pool run both halves.
            var mid = (from + to) / 2;
            var left = new Sum(items, from, mid);
            var right = new Sum(items, mid, to);
            invokeAll(left, right);   // returns only after both subtasks complete
            result += left.result + right.result;
        }
    }
}
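
Roughly how the two versions can be run and timed (a simplified sketch; the array contents and the driver itself are placeholders, not the actual benchmark, so the exact numbers will differ from the results below):

import java.util.concurrent.ForkJoinPool;

public class Main {
    public static void main(String[] args) {
        // Placeholder data: a large array of positive ints (the real data isn't shown above).
        int[] items = new int[50_000_000];
        for (int i = 0; i < items.length; i++) {
            items[i] = i % 1000 + 1;   // avoid zeros so 1 / items[i] stays finite
        }

        // Single-threaded baseline: one plain loop.
        long t0 = System.nanoTime();
        double single = 0;
        for (int item : items) {
            single += 1d / item;
        }
        System.out.printf("Single:   %s%nTotal time: %.3f%n", single, (System.nanoTime() - t0) / 1e9);

        // Parallel: hand the whole range to the common ForkJoinPool.
        long t1 = System.nanoTime();
        Sum task = new Sum(items, 0, items.length);
        ForkJoinPool.commonPool().invoke(task);
        System.out.printf("Parallel: %s%nTotal time: %.3f%n", task.result, (System.nanoTime() - t1) / 1e9);
    }
}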

Results:

Single:   1.3180710500108106E8
Total time: 0.612
Parallel: 1.3180710501986596E8
Total time: 0.18 

The numbers are very close but differ slightly. What could cause this? I noticed that if I remove the 1 / a[i] and just sum the items directly, both versions produce identical results.

There is 1 answer below.

I'm guessing that you might be trying to sum a list of millions of numbers. You wrote a multi-threaded, divide-and-conquer routine that stops dividing when a sub-list gets to be less than 100,000 elements long (int min = 100000;), so if it's worth splitting the list into chunks of that size, there must be at least a few of those chunks, right?

So, here's the issue: Let's say you want to add up a million numbers that are all around the same order of magnitude. Maybe they're all readings from the same sensor. Let's say that the arithmetic mean of the whole list is X. If you were to simply run down that list, from beginning to end, accumulating numbers,...

  • ...The expected value of the first sum is X+X,
  • ...of the next sum, X+2X
  • ...
  • ...of the last sum, X+999999X

OK, but 999999X is six orders of magnitude greater than X. In binary floating point, the exponent of 999999X is going to be greater than the exponent of X by about 20. That is to say, the binary value of 999999X is approximately the value of X shifted left by 20 bits.
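
You can see that exponent gap directly with Math.getExponent (a quick sketch using an arbitrary X; drop it into jshell or a main method):

double x = 3.7;                        // an arbitrary sensor-ish reading
double lastRunningSum = 999999 * x;    // roughly the size of the final running sum
System.out.println(Math.getExponent(x));               // 1   (2^1 <= 3.7 < 2^2)
System.out.println(Math.getExponent(lastRunningSum));   // 21, about 20 binary places bigger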

In order to do the addition, both numbers must have the same exponent, and the way that is accomplished is to denormalize X. If you shift the mantissa of X to the right by 20 bits and then add 20 to its exponent, it should, in theory, still represent the same number. The only problem is, you've just shifted away the 20 least-significant bits.
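
Here's that effect in one line (the classic demonstration; any sufficiently large and small pair of doubles shows it):

// 1.0e16 needs an exponent so large that the low bit of the sum is worth 2.0,
// so the added 1.0 is shifted completely out of the significand and lost.
System.out.println(1.0e16 + 1.0 - 1.0e16);   // prints 0.0, not 1.0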

If you're using double, the original X had 53 bits of precision, but the denormalized X that you can use in the addition only has about 33 bits. If you're using float,* the original X had 24 bits, and the denormalized X has only about 4 bits of precision.


Your goal in writing your "divide-and-conquer" algorithm was to break the problem into tasks that could be given to different threads. But a side effect was that you also got a more accurate answer. More accurate because, for each chunk, the last step is to compute X + 99999X. The exponent mismatch there is only 16 or 17 bits instead of 19 or 20 bits. You threw away three fewer bits of precision.
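
To make that concrete, here's a small sketch comparing a straight left-to-right sum against a chunked sum (sum each 100,000-element block, then add up the block totals), with BigDecimal used only as an exact reference. The values and the exact error figures printed are placeholders; the point is that the chunked error is typically smaller:

import java.math.BigDecimal;

public class ChunkDemo {
    public static void main(String[] args) {
        double[] values = new double[1_000_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = 1d / (i % 1000 + 1);           // same-order-of-magnitude values
        }

        // Straight left-to-right accumulation.
        double naive = 0;
        for (double v : values) naive += v;

        // Chunked: each partial sum stays far smaller than the grand total while accumulating.
        double chunked = 0;
        for (int start = 0; start < values.length; start += 100_000) {
            double part = 0;
            for (int i = start; i < start + 100_000; i++) part += values[i];
            chunked += part;
        }

        // Exact reference sum of the (already rounded) double values, for comparison only.
        BigDecimal exact = BigDecimal.ZERO;
        for (double v : values) exact = exact.add(new BigDecimal(v));

        System.out.println("naive   error: " + exact.subtract(new BigDecimal(naive)).abs());
        System.out.println("chunked error: " + exact.subtract(new BigDecimal(chunked)).abs());
    }
}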

To get the best possible precision, start by sorting the list, smallest numbers first. (NOTE: smallest means closest to zero, i.e., least absolute value.) Then remove the first two numbers from the list, add them, and insert the sum back into the list, in the correct place to keep the list sorted. (Those inserts go a lot faster if you use a linked list.) Finally, repeat those steps until the list contains only one number, and that's the most accurate sum you can get without using a wider data type. A rough sketch of the idea is below.
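
Here's that sketch as a method to drop into any class, using a PriorityQueue keyed on absolute value instead of a hand-maintained sorted linked list (same bookkeeping, less code):

import java.util.Comparator;
import java.util.PriorityQueue;

// Repeatedly add the two numbers closest to zero and push the sum back,
// so every individual addition combines operands of similar magnitude.
static double smallestFirstSum(double[] values) {
    PriorityQueue<Double> queue = new PriorityQueue<>(Comparator.comparingDouble(Math::abs));
    for (double v : values) queue.add(v);
    while (queue.size() > 1) {
        queue.add(queue.poll() + queue.poll());
    }
    return queue.isEmpty() ? 0.0 : queue.poll();
}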


Wider data type!! That's what you really want. If you can accumulate your sum in an IEEE quad-precision float, you've got 113 bits of precision to work with. Adding up a million numbers? Lost 20 bits of precision? No problem! The 93 bits you've got at the end are still more than the 53 bits in the doubles that you started with. You could literally add lists of a trillion numbers before you started to lose precision compared to double floats.
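
Java has no built-in IEEE quad type, but the same idea works with any wider accumulator. Here's a sketch that keeps the running total in a BigDecimal (decimal rather than binary128, and slower, but it drops no low-order bits while accumulating):

import java.math.BigDecimal;

// Each 1d / items[i] is still rounded to double once, but the running total is
// held in a type wide enough that no further precision is lost while summing.
static double wideSum(int[] items, int from, int to) {
    BigDecimal total = BigDecimal.ZERO;
    for (int i = from; i < to; i++) {
        total = total.add(new BigDecimal(1d / items[i]));
    }
    return total.doubleValue();   // round back to double once, at the very end
}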

Using a wider data type, if you've got one, will give you far better performance than the crazy sorted-list algorithm that I gave you above.


* Don't do math on float values. The only use for float is to save space in huge binary files and huge arrays. If you've got an array of float and you want to do math on them, convert to double, do the math, and convert back to float when you're done.
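
For example, a trivial sketch of that advice:

// Keep the bulk storage as float, but do the arithmetic in double.
static float meanOf(float[] samples) {
    double sum = 0;                           // accumulate in double...
    for (float s : samples) {
        sum += s;                             // each float widens to double; no extra rounding here
    }
    return (float) (sum / samples.length);    // ...and round back to float once, at the end
}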