MPI_Gatherv Negative count error


I am writing an MPI program in which I need to gather an array from every process to the root process. Since the arrays can have variable length, I am using MPI_Gatherv. However, I keep getting a "PMPI_Gatherv(455): Negative count" error. Below is the code snippet that makes the MPI_Gatherv call. I haven't posted the complete code because it is too big, but I can add other parts if required.

double *errs;
int *rcounts, *displ;
printf("P:%d calling gather with count %d\n", p->rank, f->slice_size);
if (p->rank == 0) {
    errs = (double*) malloc (sizeof(double) * NGRID);
    rcounts = (int*) malloc (sizeof(int) * p->total);
    displ = (int*) malloc (sizeof(int) * p->total);
}
MPI_Gatherv(f->err, f->slice_size, MPI_DOUBLE,
    (void*) errs, rcounts, displ, 
    MPI_DOUBLE, 0, MPI_COMM_WORLD);
printf("P:%d done with gather\n", p->rank);

f->err is the array that I am trying to send, and f->slice_size is the size of that array. The first printf prints correct values on all 4 processes; however, the last printf executes on every process except process 0.

I get the following error:

P:0 calling gather with count 250
P:1 calling gather with count 250
P:1 done with gather
P:2 calling gather with count 250
P:2 done with gather
P:3 calling gather with count 250
P:3 done with gather
    [cli_0]: aborting job:
    Fatal error in PMPI_Gatherv:
    Invalid count, error stack:
    PMPI_Gatherv(547): MPI_Gatherv failed(sbuf=0x2588290, scount=0, MPI_DOUBLE, rbuf=0x2588a70, rcnts=0x2548750, displs=0x2546d90, MPI_DOUBLE, root=0, MPI_COMM_WORLD) failed
    PMPI_Gatherv(455): Negative count, value is -1908728888

Best answer

The snippet suggests some confusion about the semantics of MPI_Gatherv(). rcounts and displ are input arguments that MPI_Gatherv() only reads on the root rank; it does not fill them in. These arrays must therefore be properly initialized before MPI_Gatherv() is invoked. In the snippet they are allocated with malloc() but never assigned, so the root passes uninitialized values, which is consistent with the garbage negative count in the error. If the root rank does not know how much data the other ranks will send, some extra logic must be added manually to retrieve this information: MPI_Gather() can be used to collect each rank's count into rcounts, and displ can then be built from rcounts.
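As a rough illustration of the last step, here is a minimal sketch of how displ could be derived from rcounts once the counts are known on the root. build_displs is a hypothetical helper name, not an MPI function, and the sketch assumes the blocks should be packed back to back in rank order. The MPI calls appear only in the trailing comment, reusing the variable names from the question, so the helper itself compiles without an MPI installation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not part of MPI): fills displs so that rank i's
 * block starts right after the blocks of ranks 0..i-1 in the receive
 * buffer, i.e. displs[i] = counts[0] + ... + counts[i-1].
 * Returns the total number of elements, or -1 if a count is negative
 * or the running total would overflow an int. */
int build_displs(const int *counts, int nranks, int *displs)
{
    assert(counts != NULL && displs != NULL);
    int total = 0;
    for (int i = 0; i < nranks; i++) {
        if (counts[i] < 0 || counts[i] > INT_MAX - total)
            return -1;
        displs[i] = total;   /* block i begins after all earlier blocks */
        total += counts[i];
    }
    return total;
}

/* Possible call order, using the names from the question (assumed, untested
 * against the full program):
 *
 *   // every rank contributes one int; only the root's rcounts is filled
 *   MPI_Gather(&f->slice_size, 1, MPI_INT,
 *              rcounts, 1, MPI_INT, 0, MPI_COMM_WORLD);
 *
 *   if (p->rank == 0)
 *       total = build_displs(rcounts, p->total, displ);
 *
 *   MPI_Gatherv(f->err, f->slice_size, MPI_DOUBLE,
 *               errs, rcounts, displ, MPI_DOUBLE, 0, MPI_COMM_WORLD);
 */
```

With four ranks each sending 250 doubles, as in the output above, this would give displacements 0, 250, 500, 750 and a total of 1000. The returned total can also be used to size errs on the root, or to check that it does not exceed NGRID.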

A somewhat similar question has already been asked and eloquently answered at "How to use MPI_Gatherv for collecting strings of diiferent length from different processor including master node?"