Problems when using Chapel 1.19 along with GASNet PSM (OmniPath) substrate


After upgrading to Chapel 1.19 while still using the OmniPath (PSM) substrate, I randomly receive the following error: ERROR calling: gasnet_barrier_try(id, 0).

I know that the Omnipath implementation of GASNet is no longer supported by the current version of Chapel. However, I would like to use some features available only in version 1.19, and the cluster I use runs over an Omnipath network.

In order to use the PSM substrate (OmniPath), I proceed as suggested by Chapel's Gitter community:

export CHPL_GASNET_ALLOW_BAD_SUBSTRATE=true

wget https://gasnet.lbl.gov/download/GASNet-1.32.0.tar.gz

tar xzf GASNet-1.32.0.tar.gz

rm -rf $CHPL_HOME/third-party/gasnet/gasnet-src

mv GASNet-1.32.0 $CHPL_HOME/third-party/gasnet/gasnet-src

Then I set up the other variables:

export CHPL_COMM='gasnet'

export CHPL_LAUNCHER='gasnetrun_psm'

export CHPL_COMM_SUBSTRATE='psm'

export CHPL_GASNET_SEGMENT='everything'

export CHPL_TARGET_CPU='native'

export GASNET_PSM_SPAWNER='ssh'

export HFI_NO_CPUAFFINITY=1

Next, I build the runtime, etc.

However, when I run experiments, I randomly receive the following error:

ERROR calling: gasnet_barrier_try(id, 0)
 at: comm-gasnet.c:1020
 error: GASNET_ERR_BARRIER_MISMATCH (Barrier id's mismatched)

This error terminates the program.

I cannot find the reason for this error in the GASNet documentation. I could only find a little information in GASNet's source code.

Do you know what causes this problem?

Thank you all.

Best answer

I realize this is an old question, but for the record, the current version of Chapel (1.28.0) embeds a version of GASNet (GASNet-EX 2022.3.0 as of this writing) whose ofi-conduit, selected with CHPL_COMM=gasnet CHPL_COMM_SUBSTRATE=ofi, provides high-quality support for Intel Omni-Path.

In particular, there should no longer be any reason to clobber Chapel's embedded version of GASNet-EX with an ancient/outdated GASNet-1 to get Omni-Path support, as suggested in the original question.
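As a rough sketch, the two settings named above might be applied in a shell like this. Only CHPL_COMM and CHPL_COMM_SUBSTRATE come from this answer. The rest is an assumption for illustration: that CHPL_HOME points at an unmodified Chapel 1.28.0 (or later) source tree, and that the override variable from the question's workaround may still be set in the environment. Launcher and spawner settings are deliberately left out, since they are site-specific and covered by the Omni-Path instructions.

```shell
# Sketch only: select the GASNet ofi-conduit named above.

# Drop the override used by the old GASNet-1 workaround, if it is still set.
unset CHPL_GASNET_ALLOW_BAD_SUBSTRATE

# The two settings taken from this answer.
export CHPL_COMM=gasnet
export CHPL_COMM_SUBSTRATE=ofi

# Assumption: CHPL_HOME points at an unmodified Chapel tree.
# If the printchplenv helper is present, show the resulting configuration.
if [ -n "${CHPL_HOME:-}" ] && [ -x "$CHPL_HOME/util/printchplenv" ]; then
  "$CHPL_HOME/util/printchplenv"
fi

# The Chapel runtime has to be rebuilt after changing these settings
# (typically by running make in $CHPL_HOME); that step is left out here.
```

With these exported, the embedded GASNet-EX should be used as shipped, so the third-party/gasnet/gasnet-src directory does not need to be replaced.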

For more details see Chapel's detailed Omni-Path instructions.