The code below shows problems with OpenMP tasking in ICL 2021.6.0 and in ICX 2022.1.0 (Clang based).

Firstly, I am wondering whether I am doing something fundamentally wrong in my OpenMP code that just shows up differently under different compilers. Assuming the code is valid OpenMP:

When fails_intel_icl() is compiled with ICL, the task execution is simply wrong: some tasks run twice, some not at all. Compiled with ICX/Clang it executes as I expect.

When crash_icx_2022() is compiled with ICX, it crashes at runtime.

I am testing using Visual Studio 2022/Debug/x64 and the latest oneAPI Base and HPC installation.

An example of the incorrect runtime behaviour of fails_intel_icl() when compiled with ICL is as follows:
Thread:12 launching task for 0,1 <--- you will note the task for pair 0,1 never runs.
Thread:12 launching task for 0,2
Thread:9 Executing task with pair 0,2 ....
#include <iostream>
#include <utility>
#include <vector>
#include <omp.h>

std::vector<std::pair<int, std::vector<int>>> data;

void setup()
{
    std::vector<int> tmp({ 1,2,3,4,5 });
    for (int i = 0; i < 5; i++)
    {
        data.push_back({ i,tmp });
    }
}

void DoTask(int a, int b)
{
    {
        #pragma omp critical
        std::cout << "Thread:" << omp_get_thread_num() << " Executing task with pair " << a << ',' << b << std::endl;
    }
}

// runs correctly under icl, but crashes at runtime with icx and clang
void crash_icx_2022()
{
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (auto iter = data.begin(); iter != data.end(); ++iter)
            {
                const auto& a = iter->first;
                const auto& b = iter->second;
                for (const auto& aa : b)
                {
                    if (aa != a)
                    {
                        {
                            #pragma omp critical
                            std::cout << "Thread:" << omp_get_thread_num() << " launching task for " << ' ' << a << ',' << aa << std::endl;
                        }
                        #pragma omp task
                        {
                            DoTask(a, aa);
                        }
                    }
                }
            }
        }
    }
}

// this compiles and runs incorrectly under icl but runs correctly with icx or clang
void fails_intel_icl()
{
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (auto iter = data.begin(); iter != data.end(); ++iter)
            {
                const auto a = iter->first;
                const auto b = iter->second;
                for (const auto aa : b)
                {
                    if (aa != a)
                    {
                        {
                            #pragma omp critical
                            std::cout << "Thread:" << omp_get_thread_num() << " launching task for " << ' ' << a << ',' << aa << std::endl;
                        }
                        #pragma omp task
                        {
                            DoTask(a, aa);
                        }
                    }
                }
            }
        }
    }
}

void testTaskingBug()
{
    setup();
    std::cout << "\nStarting test using copies\n" << std::endl;
    fails_intel_icl();
    std::cout << "\nStarting test using references" << std::endl;
    crash_icx_2022();
}

int main()
{
    testTaskingBug();
    return 0;
}
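
One way to confirm the "some tasks run twice, some not at all" symptom without reading interleaved console output is to have each task record its pair and inspect the log afterwards. The following is only a sketch under assumptions: make_input, run_tasks_and_record and has_duplicates are hypothetical names that are not part of the code above, and the explicit firstprivate/shared clauses are an addition so that the result does not depend on the implicit data-sharing rules. With the same input as setup(), a correct run should record 21 distinct pairs.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using Pair = std::pair<int, int>;
using Entry = std::pair<int, std::vector<int>>;

// Hypothetical helper: builds the same input as setup() in the post.
std::vector<Entry> make_input()
{
    std::vector<Entry> input;
    const std::vector<int> tmp({ 1, 2, 3, 4, 5 });
    for (int i = 0; i < 5; i++)
    {
        input.push_back({ i, tmp });
    }
    return input;
}

// Hypothetical helper: same loop structure as fails_intel_icl(), but each task
// appends its pair to a shared log instead of printing it.
std::vector<Pair> run_tasks_and_record(const std::vector<Entry>& input)
{
    std::vector<Pair> executed;
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (auto iter = input.begin(); iter != input.end(); ++iter)
            {
                int a = iter->first;
                for (int aa : iter->second)
                {
                    if (aa != a)
                    {
                        // a and aa are copied into the task explicitly.
                        #pragma omp task firstprivate(a, aa) shared(executed)
                        {
                            // Serialise access to the shared log.
                            #pragma omp critical(record_pair)
                            executed.push_back({ a, aa });
                        }
                    }
                }
            }
        }
    }
    // All tasks have completed at the barrier ending the parallel region.
    std::sort(executed.begin(), executed.end());
    return executed;
}

// True if any pair was recorded more than once (expects a sorted log).
bool has_duplicates(const std::vector<Pair>& sorted_log)
{
    return std::adjacent_find(sorted_log.begin(), sorted_log.end()) != sorted_log.end();
}
```

If the log has fewer than 21 entries or has_duplicates() returns true under one compiler but not another, that points at the toolchain rather than at the loop logic.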

The following C++17 code will not compile under Clang. I am not sure whether the error is legitimate.
void clang_wont_compile()
{
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (const auto& [a, b] : data)
            {
                for (const auto& aa : b)
                {
                    if (aa != a)
                    {
                        #pragma omp task
                        DoTask(a, aa);
                    }
                }
            }
        }
    }
}
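
A possible workaround, assuming the Clang error comes from the task region having to capture the structured binding a (a binding is a name for a sub-object, not an ordinary variable), is to copy the binding into a plain local and pass that to the task. This is a sketch only: make_input and checksum_with_named_copies are hypothetical names, and the checksum stands in for DoTask so the result can be checked.

```cpp
#include <utility>
#include <vector>

using Entry = std::pair<int, std::vector<int>>;

// Hypothetical helper: builds the same input as setup() in the post.
std::vector<Entry> make_input()
{
    std::vector<Entry> input;
    const std::vector<int> tmp({ 1, 2, 3, 4, 5 });
    for (int i = 0; i < 5; i++)
    {
        input.push_back({ i, tmp });
    }
    return input;
}

// Hypothetical stand-in for the task loop: each task adds 10 * a + aa to a
// shared checksum instead of calling DoTask(a, aa).
long checksum_with_named_copies(const std::vector<Entry>& input)
{
    long checksum = 0;
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (const auto& [key, values] : input)
            {
                // Copy the binding into an ordinary variable; only this
                // copy is referenced inside the task region.
                int a = key;
                for (int aa : values)
                {
                    if (aa != a)
                    {
                        #pragma omp task firstprivate(a, aa) shared(checksum)
                        {
                            #pragma omp atomic
                            checksum += 10 * a + aa;
                        }
                    }
                }
            }
        }
    }
    return checksum;
}
```

The binding is still used inside the single region, where it is declared, and only the named copy crosses into the task. Whether a given Clang version accepts this is an assumption worth checking against the actual compiler.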

Thanks for pointing this out. It does look like it should be valid OpenMP code. Something in the backend's handling of task combined with critical may be throwing off the compiler. I also considered whether the combination is disallowed by the spec, but that does not seem to be the case.
I am double-checking with some OpenMP folks to see whether we have a bug here (or a better explanation for the behavior).