On the root there is a matrix of size r*height by r*width. There are r*r processes, and I want to send a width*height submatrix to each of them. In my lecture I am learning about the commands in the title, but I am a little confused about how to use them.
I allocate the matrix and the submatrix, called tile, on every process. I create a datatype for the submatrices called tile_t and resize it so that the different tiles can interleave.
// Allocated on every process, fixed in real code
std::vector<double> everything (height*width*r*r); // r is 3 later, because we have 9 submatrices
std::vector<double> tile (height*width); // width and height are both 3 here
MPI_Datatype vector_t, tile_t;
MPI_Type_vector(height, width, width*r, MPI_DOUBLE, &vector_t);
MPI_Type_create_resized(vector_t, 0, width * sizeof(double), &tile_t); // lb and extent are given in bytes
MPI_Type_commit(&tile_t);
Here is an example of how I want to scatter the matrix: each digit marks the submatrix that element belongs to. There are 3 submatrices in one "row". Resizing tile_t means that the first tile starts at index 0, the second at index 3, and the third at index 6, so they interleave.
-------------
|000|111|222|
|000|111|222|
|000|111|222|
-------------
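To make the interleaving concrete, here is a small sketch without MPI that lists which indices of everything one tile covers. The helper name tile_indices is made up for illustration, and start is counted in elements (doubles), not bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of MPI): lists the element indices that a
// type built with MPI_Type_vector(height, width, stride, ...) covers when it
// is placed `start` elements into the buffer.
std::vector<std::size_t> tile_indices(std::size_t height, std::size_t width,
                                      std::size_t stride, std::size_t start) {
    assert(stride >= width); // the rows of one tile must not overlap
    std::vector<std::size_t> idx;
    idx.reserve(height * width);
    for (std::size_t row = 0; row < height; ++row)
        for (std::size_t col = 0; col < width; ++col)
            idx.push_back(start + row * stride + col);
    return idx;
}
```

With height = width = 3 and stride = 9, a start of 0 gives 0, 1, 2, 9, 10, 11, 18, 19, 20 and a start of 3 gives 3, 4, 5, 12, 13, 14, 21, 22, 23, so neighbouring tiles share rows of the big matrix without sharing elements.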
What I don't get is how I can use Scatterv to make a bigger jump. A potential 4th submatrix in the next "row" would need to start at index 27. Here is how I imagine Scatterv should be used on the matrix below.
-------------
|000|111|222|
|000|111|222|
|000|111|222|
-------------
|333|444|555|
|333|444|555|
|333|444|555|
-------------
|666|777|888|
|666|777|888|
|666|777|888|
-------------
int dspls[] = {0,1,2,9,10,11,18,19,20};
int counts[] = {1,1,1,1,1,1,1,1,1};
MPI_Scatterv(
    everything.data(),
    counts, // one tile_t per rank; a tile already holds 9 doubles
    dspls,
    tile_t, // send type
    tile.data(),
    9, // recv count
    MPI_DOUBLE, // recv type, matching the double buffers
    rank_of_root,
    MPI_COMM_WORLD
);
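To sanity-check these displacements without running MPI, here is a plain C++ stand-in for the call above. It is only a sketch: simulate_tile and labelled_matrix are names I made up, and it assumes the extent of tile_t is width elements (that is, width * sizeof(double) bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ stand-in for the Scatterv call (no MPI involved): returns the
// height*width values that rank `rank` would receive. `extent` is the resized
// extent of the send type, counted in elements here rather than bytes.
std::vector<double> simulate_tile(const std::vector<double>& everything,
                                  const std::vector<int>& displs, int rank,
                                  int height, int width, int stride, int extent) {
    assert(rank >= 0 && rank < static_cast<int>(displs.size()));
    const std::size_t start = static_cast<std::size_t>(displs[rank]) * extent;
    std::vector<double> tile;
    tile.reserve(static_cast<std::size_t>(height) * width);
    for (int row = 0; row < height; ++row)
        for (int col = 0; col < width; ++col)
            tile.push_back(everything.at(start + static_cast<std::size_t>(row) * stride + col));
    return tile;
}

// Builds the example matrix from the drawing: every element holds the number
// of the tile it belongs to.
std::vector<double> labelled_matrix(int r, int height, int width) {
    const int row_len = r * width;
    std::vector<double> m(static_cast<std::size_t>(r) * height * row_len);
    for (int y = 0; y < r * height; ++y)
        for (int x = 0; x < row_len; ++x)
            m[static_cast<std::size_t>(y) * row_len + x] = (y / height) * r + (x / width);
    return m;
}
```

Calling simulate_tile(labelled_matrix(3, 3, 3), {0,1,2,9,10,11,18,19,20}, 4, 3, 3, 9, 3) returns nine copies of 4.0, which is the tile I would expect rank 4 to get under that assumption.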
Now I understood it this way: to calculate the address of each block Scatterv wants to send, it does this:
displs[i]*sizeof(tile_t)
and because I resized tile_t, this expression equals:
displs[i] * width = displs[i] * 3
or it does
displs[i] * width * sizeof(double) = displs[i] * 3 * sizeof(double)
What should I put into dspls for the code to make sense?
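For reference, as far as I can tell from the MPI documentation, Scatterv takes the data for rank i from sendbuf plus displs[i] times the extent of the send type, and the extent passed to MPI_Type_create_resized is in bytes. The sketch below works that rule through under the assumption that the extent is width * sizeof(double) bytes; tile_displs and send_offset_bytes are hypothetical helper names, not MPI functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: displacement of every tile in units of the send type's
// extent, assuming that extent equals `width` elements.
// Tile (i, j) starts at element i*height*(r*width) + j*width
//                             = (i*height*r + j) * width.
std::vector<int> tile_displs(int r, int height) {
    assert(r > 0 && height > 0);
    std::vector<int> displs;
    displs.reserve(static_cast<std::size_t>(r) * r);
    for (int i = 0; i < r; ++i)      // tile row
        for (int j = 0; j < r; ++j)  // tile column
            displs.push_back(i * height * r + j);
    return displs;
}

// Byte offset into the send buffer at which the data for one rank begins:
// the displacement times the extent of the send type, with the extent in bytes.
std::size_t send_offset_bytes(int displ, std::size_t extent_bytes) {
    assert(displ >= 0);
    return static_cast<std::size_t>(displ) * extent_bytes;
}
```

For r = 3 and height = 3, tile_displs gives 0, 1, 2, 9, 10, 11, 18, 19, 20, and send_offset_bytes(9, 3 * sizeof(double)) equals 27 * sizeof(double), which matches the 4th tile starting at index 27.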