Segmentation fault in MPI_Send for derived data types


In the code below, if MPI_Get_address(&member, &offset[0]); in Family::make_layout() is replaced with offset[0] = 0;, the code works as expected; otherwise it crashes with the output below. As far as I know, using MPI_BOTTOM requires absolute memory addresses, which is why I use MPI_Get_address(). struct Member has no problem with MPI_Get_address(), but struct Family does not work with it. What is the problem?

Command:

mpirun -n 2 ./out

Output:

Signal: Segmentation fault (11)
Signal code:  (128)
Failing at address: (nil)
mpirun noticed that process rank 0...

Code:

#include <mpi.h>

struct Member
{
    double height;
    MPI_Datatype mpi_dtype;
    Member() { make_layout(); }
    void make_layout()
    {
        const int nblock = 1; // constant, so the arrays below are not variable-length arrays
        int block_count[nblock] = {1};
        MPI_Aint offset[nblock];
        MPI_Get_address(&height, &offset[0]);
        MPI_Datatype block_type[nblock] = {MPI_DOUBLE};
        MPI_Type_create_struct(nblock, block_count, offset, block_type, &mpi_dtype);
        MPI_Type_commit(&mpi_dtype);
    }
};

struct Family
{
    Member member;
    MPI_Datatype mpi_dtype;
    Family() { make_layout(); }
    void make_layout()
    {
        const int nblock = 1; // constant, so the arrays below are not variable-length arrays
        int block_count[nblock] = {1};
        MPI_Aint offset[nblock];
        MPI_Get_address(&member, &offset[0]);
        //offset[0] = 0; <-- HERE!!!!!!!
        MPI_Datatype block_type[nblock] = {member.mpi_dtype};
        MPI_Type_create_struct(nblock, block_count, offset, block_type, &mpi_dtype);
        MPI_Type_commit(&mpi_dtype);
    }
};

int main()
{
    int rank;

    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
    {
        Family f;
        MPI_Send(MPI_BOTTOM, 1, f.mpi_dtype, 1, 0, MPI_COMM_WORLD);
    }
    else if (rank == 1)
    {
        Family f;
        MPI_Recv(MPI_BOTTOM, 1, f.mpi_dtype, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();

    return 0;
}
Answer:

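member.mpi_dtype already carries the absolute address of Member::height as the displacement in its typemap. When Family::make_layout() sets offset[0] to the address of member, MPI adds the two displacements together, so the resulting type points at (address of member) + (address of height), which is not a valid location for your data. This is also why offset[0] = 0 happens to work: the absolute address stored in member.mpi_dtype is then left unchanged. For that very reason, MPI datatypes built with absolute addresses should not be used to construct other datatypes.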
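The address arithmetic can be sketched in plain C++ without any MPI calls. This is only a model of how displacements combine, not how any MPI implementation is actually written: nest() and resolve() are hypothetical helpers, the structs are simplified copies of the ones in the question (the mpi_dtype fields are left out), and MPI_BOTTOM is modelled as base address 0.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-ins for the structs in the question (no mpi_dtype field).
struct Member { double height; };
struct Family { Member member; };

// Hypothetical helper: numeric value of a pointer.
std::uintptr_t addr(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p);
}

// Model of nesting a one-entry typemap inside another type:
// the outer displacement is added to the inner displacement.
std::uintptr_t nest(std::uintptr_t outer_disp, std::uintptr_t inner_disp)
{
    return outer_disp + inner_disp;
}

// Model of the address that is finally accessed:
// buffer base plus the displacement stored in the type.
// Passing base == 0 models MPI_BOTTOM.
std::uintptr_t resolve(std::uintptr_t base, std::uintptr_t disp)
{
    return base + disp;
}
```

In this model, for a Family f, resolve(0, nest(addr(&f.member), addr(&f.member.height))) is the sum of two absolute addresses and does not equal addr(&f.member.height). With an outer displacement of 0, or with relative displacements from offsetof and addr(&f) as the base, the result is the address of height.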

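There is absolutely no reason to use MPI_BOTTOM in this case: your structures have no dynamically allocated fields, so MPI datatypes with relative offsets (displacements measured from the start of the structure) are sufficient. Pass the address of the object as the buffer argument instead of MPI_BOTTOM.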