"Program received signal SIGSEGV: Segmentation fault - invalid memory reference." when using a large array with MPI_BARRIER


I am using Fortran with MPI (Cray's compiler). I run on 512 cores, and I found that once my variable exceeds a certain size, the code crashes at MPI_BARRIER with the following error message:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
.
.
.

One possibly useful piece of information: I print a tag (i.e., write(*,*) "tag") before calling MPI_BARRIER, and the number of tags printed (426) plus the number of repeated error messages (86) equals the number of cores I used (512).
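
A minimal sketch of the tag-before-barrier pattern described above. This is not the actual code: the program name is hypothetical, the work on the large array is omitted, and printing the rank and flushing the output are additions that are assumed to make the tag count easier to match against individual ranks.

```fortran
program barrier_tag
  use, intrinsic :: iso_fortran_env, only: output_unit
  use mpi
  implicit none
  integer :: ierr, rank, nprocs

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

  ! ... work on the large array would go here (omitted in this sketch) ...

  ! Each rank that gets this far prints its tag; the rank number is an
  ! addition, so missing ranks can be identified from the output.
  write(output_unit,*) "tag", rank, "of", nprocs
  flush(output_unit)  ! push the line out before entering the barrier

  call MPI_BARRIER(MPI_COMM_WORLD, ierr)

  call MPI_FINALIZE(ierr)
end program barrier_tag
```

With the rank in the output, the 426 tags could be checked against the 512 ranks to see which ones never print before the barrier.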

I think this is a memory issue. I submit my job with Slurm, and I remember trying something like "ulimit -s unlimited" (I can't find the web page now...), but I haven't been able to solve the problem.
