Confusion with #pragma omp for


I'm new to parallel programming (just now learning it in class), and I'm a little confused as to what exactly this directive does. I've been told (and read online) that it uses worksharing to get various threads to work through the for loop's iterations, but I don't understand what exactly this means. For example, if we have:

int i;
#pragma omp parallel for
for (i = 0; i < 10; i++){
    foo();
}

Say we have 4 threads, and we identify each iteration by its "i". Will this code basically make thread 1 do iteration 0, thread 2 do iteration 1, thread 3 do iteration 2, thread 4 do iteration 3, thread 1 (again) do iteration 4, and so on? (I know that obviously it might not be thread 1 doing iteration 0, it could very well be thread 4 doing that, but it's easier for me to think about it this way.)

If this is the case, why does each thread need its own private version of "i"? Generally, why do worksharing cases need private variables?

Once again, thank you in advance for any help!!


There are 2 answers below.

Answer 1

You are basically correct in your assessment of what happens to the loop. When you do not specify a schedule clause, typical OpenMP implementations pick a static schedule, that is, the loop iterations are split across the threads such that each thread receives a contiguous block of roughly the same number of iterations.

As you correctly state, for this to work, each thread needs its "own" private variable i. The OpenMP API specification mandates that the loop variable of a parallel loop with the for directive is automatically made private. So, in this code:

int i;
#pragma omp parallel for
for (i = 0; i < 10; i++){
    foo();
}

the variable i is private to each thread.

Making other variables private is useful, too:

int i;
int tmp;
#pragma omp parallel for
for (i = 0; i < 10; i++){
    tmp = foo(); /* race condition on tmp, don't use! */
    bar(tmp);
}

In this case, the code computes some result and stores it in tmp. This variable is not the loop counter, so it is not automatically made private; all threads share the single tmp, which is a race condition. You have to explicitly make the variable private:

int i;
int tmp;
#pragma omp parallel for private(tmp)
for (i = 0; i < 10; i++){
    tmp = foo();
    bar(tmp);
}

In C/C++, you can also use block scopes to achieve the same thing:

int i;
#pragma omp parallel for
for (i = 0; i < 10; i++){
    int tmp;
    tmp = foo();
    bar(tmp);
}

The previous and the last code example are equivalent in that tmp is private to each thread.

Answer 2

This code

int i;
#pragma omp parallel for
for (i = 0; i < n; i++){
    foo();
}

is, with the default static schedule, roughly equivalent to the following (using omp_get_num_threads and omp_get_thread_num from <omp.h>; n is assumed to be the number of iterations):

int i;
#pragma omp parallel private(i)
{
    int nt = omp_get_num_threads();          /* threads in the team */
    int it = omp_get_thread_num();           /* this thread's id, 0..nt-1 */
    int iter_per_thread = (n - 1) / nt + 1;  /* ceiling of n/nt */
    int imin = it * iter_per_thread;         /* first iteration of this thread */
    int imax = imin + iter_per_thread;       /* one past its last iteration */
    if (imax > n) imax = n;                  /* the last block may be shorter */
    for (i = imin; i < imax; i++){
        foo();
    }
}

Each thread executes a portion of the loop, so it definitely needs a private version of the loop index i.