OpenMP - Parallelizing nested loop

Question

I'm having issues with trying to recode a nested for loop in order to parallelize it:

for(i=0; i<n; i++)
{
    for(j=0; j<n; j++)
    {
        if(asubsref(struct1,j) > 0)
            asubsref(struct2,j) = asubsref(struct3,j) + 1;
    }
    for(j=0; j<n; j++)
        asubsref(struct1,j) = asubsref(struct2,j) - asubsref(struct3,i);
}

Struct1/struct2 are two structs with a width/height/int-float array respectively. struct3 is a float struct.

My attempt so far was to make them into two different loops but alas, it didn't work as I'd get a lot of incorrect results:

#pragma omp parallel
{
#pragma omp for private(j)
   for(i=0; i<n; i++)
   {
     for(j=0; j<n; j++)
     {
       if(asubsref(struct1,j) > 0)
         asubsref(struct2,j) += 1;
     }
   }
#pragma omp for private(j)
   for(i=0; i<n; i++)
   {
     k = asubsref(struct3,i);
     for (j=0; j<n; j++)
     {
       asubsref(struct1,j) -= k;
     }
   }
}

I'm not looking for an answer but some guidance to help me thinking how to go about this/tips toward the answer and the like.

Also, are you sure that: asubsref(bin,j) = asubsref(bin,j) + 1; and asubsref(bin,j) += 1; are identical statements? If asubsref has side effects, this may not be true. — ArjunShankar
– ArjunShankar, Commented May 23, 2012 at 10:26
What I mean is: Are the results 'correct' when you compile the second version of your code without OpenMP? — ArjunShankar
– ArjunShankar, Commented May 23, 2012 at 10:27
is there a specific reason why you want to parallelize the nested loop and not the outer-most one? — Anton Pegushin
– Anton Pegushin, Commented May 23, 2012 at 11:03
@AntonPegushin - It is the outermost loop that is being parallelized. But it has been split into two loops. This of course, is questionable (depends on side effect of all the calls to asubsref) — ArjunShankar
– ArjunShankar, Commented May 23, 2012 at 11:53
Sorry, as it turns out my code in the second half was completely wrong, should've checked whether it worked without the pragmas first! Also, I wanted to parallelize the outer loops, not the inner ones. Thanks for the help! — Feels Code Man
– Feels Code Man, Commented May 23, 2012 at 17:54

Dima Chubarov · Accepted Answer · 2012-05-23 12:38:15Z

What I see in this code is three arrays:

array1: asubsref(seed,0) ... asubsref(seed,n-1)
array2: asubsref(bin,0) ... asubsref(bin,n-1)
array3: asubsref(w,0) ... asubsref(w,n-1)

If this assumption is correct and asubsref does not produce any side effects, the following invariant can be derived:

After the end of the execution of the loop, array2[j] is incremented by the number x which is the largest number such that sum of array3[i] for i from 0 to x is smaller than array1[j].

Here is what you can do. First, you can merge the two innermost loops since (under our assumption) their iterations are independent:

for(i=0; i<n; i++)
{
    for(j=0; j<n; j++)
    {
        if(asubsref(seed,j) > 0)
            asubsref(bin,j) = asubsref(bin,j) + 1;
        asubsref(seed,j) = asubsref(seed,j) - asubsref(w,i);
    }
}

Then interchange the innermost and outermost loops

for(j=0; j<n; j++)
{
   for(i=0; i<n; i++)
   {
        if(asubsref(seed,j) > 0)
            asubsref(bin,j) = asubsref(bin,j) + 1;
        asubsref(seed,j) = asubsref(seed,j) - asubsref(w,i);
   }
}

Now it is clear that the following code should work

#pragma omp parallel for (private i)
for(j=0; j<n; j++)
{
   for(i=0; i<n; i++)
   {
        if(asubsref(seed,j) > 0)
            asubsref(bin,j) = asubsref(bin,j) + 1;
        asubsref(seed,j) = asubsref(seed,j) - asubsref(w,i);
   }
}

while splitting the loops clearly breaks the invariant.

Collectives™ on Stack Overflow

OpenMP - Parallelizing nested loop

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related