Calling OpenACC code from Rust does not run on GPU

418 Views Asked by Sanhu Li At 05 September 2020 at 21:22

Update 09/07/2020

I'm adding a small example for this at https://gitlab.com/lisanhu2016/rust-openacc-example.git

It's a public repository with a README, so you should be able to try the example there.

The libraries I have been linking to are nvc, acchost, and pgm; you can see them in build.rs.

===

I’m trying to call OpenACC code from Rust, and I’m having issues that are probably related to linking. These are the steps I’m using:

Compile bindings.cc with the OpenACC flags -acc -gpu=managed -Minfo=accel and archive it into a static library, libfoo.a:

[ 87%] Building CXX object CMakeFiles/foo.dir/bindings.cc.o
/opt/nvidia/hpc/20.7/Linux_x86_64/20.7/compilers/bin/nvc++    -fast -O3 -DNDEBUG   -fPIC -acc -gpu=managed -Minfo=accel -o CMakeFiles/foo.dir/bindings.cc.o -c /usa/lisanhu/tmp/rust-c-ffi-example/bindings.cc
process_batch:
      6, Generating copyout(lengths[:array.l]) [if not already present]
         Generating implicit copyin(array.data[:]) [if not already present]
         Generating Tesla code
         10, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
      6, Generating implicit copyin(array) [if not already present]
[100%] Linking CXX static library libfoo.a

Compile the Rust code against libfoo.a, dynamically linking to the libraries nvc, nvcpumath, nvhost, and nvdevice.
Run the code with some data; it runs.
Run the code with PGI_ACC_TIME=1; no timing information is printed.
Run the code under ncu --set full; it reports no kernel information.

I think we are probably linking to the wrong libraries, but I'm not sure. Could you please help me with this problem? Thank you so much!


Mat Colgrove On 11 September 2020 at 18:52 BEST ANSWER

I worked with Sanhu via the OpenACC Slack channel on this one. There were two issues.

First, since he's not linking with the PGI/NV compilers, he needs to add the "-gpu=nordc" flag. RDC (relocatable device code) requires a device link step, which is not performed when Rust does the linking.

Second, since the "data" array is allocated by Rust, it won't be placed in CUDA Unified Memory. Hence he needed to add it to an OpenACC data region with the "process_array" routine.
