0

I struggle a bit with a function. The calculation is wrong if I try to parallelize the outer loop with a

#pragma omp parallel reduction(+:det). 

Can someone show me how to solve it and why it is failing?

// template<class T> using vector2D = std::vector<std::vector<T>>;

float Det(vector2DF &a, int n)
{
  vector2DF m(n - 1, vector1DF(n - 1, 0));

  if (n == 1) return a[0][0];
  if (n == 2) return a[0][0] * a[1][1] - a[1][0] * a[0][1];

  float det = 0;
  for (int i = 0; i < n; i++)
  {
    int l = 0;
#pragma omp parallel for private(l)
    for (int j = 1; j < n; j++)
    {
      l = 0;
      for (int k = 0; k < n; k++)
      {
        if (k == i) continue;
        m[j - 1][l] = a[j][k];
        l++;
      }
    }
    det += std::pow(-1.0, 1.0 + i + 1.0) * a[0][i] * Det(m, n - 1);
  }

  return det;
}
4
  • 1
    The calculation is wrong And, what is expected output? What is actual output? What steps did you take to try and resolve the problem yourself? Commented May 19, 2017 at 13:50
  • This was the original version :/ pastebin.com/ZJFjAY5T Commented May 19, 2017 at 13:58
  • All information, necessary to answer your question (or solve your problem) should be present in the question itself, and not some external links, that may expire at an unspecified time. For that reason - I refuse to follow external links. Commented May 19, 2017 at 14:03
  • 1
    Don't write std::pow(-1.0, 1.0 + i + 1.0) * ... which is btw the same as std::pow(-1.0, i) * .... Do this with something like (i%2 == 0 ? 1.0 : -1.0) * ... or even so: (i%2 == 0 ? det += ... : det -= ...). Commented May 19, 2017 at 14:16

1 Answer 1

2

If you parallelize the outer loop, there is a race condition on this line:

m[j - 1][l] = a[j][k];

Also you likely want a parallel for reduction instead of just a parallel reduction.

The issue is, that m is shared, even though that wouldn't be necessary given that it is completely overwritten in the inner loop. Always declare variables as locally as possible, this avoids issues with wrongly shared variables, e.g.:

float Det(vector2DF &a, int n)
{
  if (n == 1) return a[0][0];
  if (n == 2) return a[0][0] * a[1][1] - a[1][0] * a[0][1];

  float det = 0;
  #pragma omp parallel reduction(+:det)
  for (int i = 0; i < n; i++)
  {
    vector2DF m(n - 1, vector1DF(n - 1, 0));
    for (int j = 1; j < n; j++)
    {
      int l = 0;
      for (int k = 0; k < n; k++)
      {
        if (k == i) continue;
        m[j - 1][l] = a[j][k];
        l++;
      }
    }
    det += std::pow(-1.0, 1.0 + i + 1.0) * a[0][i] * Det(m, n - 1);
  }
  return det;
}

Now that is correct, but since m can be expensive to allocate, performance could benefit from not doing it in each and every iteration. This can be done by splitting parallel and for directives as such:

float Det(vector2DF &a, int n)
{
  if (n == 1) return a[0][0];
  if (n == 2) return a[0][0] * a[1][1] - a[1][0] * a[0][1];

  float det = 0;
  #pragma omp parallel reduction(+:det)
  {
    vector2DF m(n - 1, vector1DF(n - 1, 0));
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
    {
      for (int j = 1; j < n; j++)
      {
        int l = 0;
        for (int k = 0; k < n; k++)
        {
          if (k == i) continue;
          m[j - 1][l] = a[j][k];
          l++;
        }
      }
      det += std::pow(-1.0, 1.0 + i + 1.0) * a[0][i] * Det(m, n - 1);
    }
  }
  return det;
}

Now you could also just declare m as firstprivate, but that would assume that the copy constructor makes a completely independent deep-copy and thus make the code more difficult to reason about.

Please be aware that you should always include expected output, actual output and a minimal complete and verifiable example.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.