Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Alltoall with Vector Datatype
From: Spenser Gilliland (spenser_at_[hidden])
Date: 2014-05-08 18:16:56


George,

> Here is basically what is happening. On the top left, I depicted the datatype resulting from the vector type. The two arrows point to the lower bound and upper bound (thus the extent) of the datatype. On the top right is the resized datatype, where the ub is now moved 2 elements after the lb, allowing for a nice interleaving of the data. The next line is the unrolled datatype representation, flattened to 1D. Again it shows in red the data touched by the defined memory layout, as well as the extent (lb and ub).
>
> Now, let’s move on to the MPI_Alltoall call. The array is the one without colors, and I put the datatype starting from the position you specified in the alltoall. As you can see, as soon as you don’t start at the origin of the allocated memory, you end up writing outside of your data. This happens deep inside the MPI_Alltoall call (there is no validation at the MPI level).

Why are the last two elements in the 1D view present? If that's the
case, I would have to define a new MPI datatype for each set of columns
within a matrix. Why would it be defined in this manner? Also, why
is the extent of the initial vector type equal to 12 when it is
actually accessing 16 elements (for the 4x4 example)?

So, is this a bug in MPI_Alltoall or in Open MPI?

I believe MPI_Alltoall is causing the bug, not the vector type, because the following code

MPI_Aint lb, extent, true_lb, true_extent;
MPI_Type_get_extent(mpi_all_t, &lb, &extent);
MPI_Type_get_true_extent(mpi_all_t, &true_lb, &true_extent);
/* MPI_Aint may be wider than int, so cast to long for printf. */
printf("mpi_all_t - lb = %ld, extent = %ld, true_lb = %ld, true_extent = %ld\n",
       (long)lb, (long)extent, (long)true_lb, (long)true_extent);

produces

mpi_all_t - lb = 0, extent = 16, true_lb = 0, true_extent = 240
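
For reference, these numbers are consistent with a vector of 8 blocks of 4 floats with a stride of 8 floats, resized to an extent of 4 floats. Those parameters are an assumption (the type construction is not shown in this message), and the helper names below are hypothetical. The sketch only reproduces the byte arithmetic for such a layout and makes no MPI calls:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes from the first to the last byte
 * touched by `count` blocks of `blocklen` elements whose starts are
 * `stride` elements apart (stride assumed positive). Resizing a type
 * changes its extent but not this span, which is what the true extent
 * reports. */
static size_t vector_span_bytes(size_t count, size_t blocklen,
                                size_t stride, size_t elem_size)
{
    assert(elem_size > 0);
    if (count == 0 || blocklen == 0)
        return 0;
    /* The last block starts (count - 1) * stride elements in and is
     * blocklen elements long. */
    return ((count - 1) * stride + blocklen) * elem_size;
}

/* Hypothetical helper: extent in bytes after resizing the type so that
 * consecutive elements are `n_elems` base elements apart. */
static size_t resized_extent_bytes(size_t n_elems, size_t elem_size)
{
    assert(elem_size > 0);
    return n_elems * elem_size;
}
```

Under the assumed parameters, `vector_span_bytes(8, 4, 8, 4)` gives 240 and `resized_extent_bytes(4, 4)` gives 16, matching the true extent and extent printed above.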

This means that the size is correct (using 4-byte floats with 2
processors on an 8x8 array, this would be the 30th element).
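
The out-of-bounds behavior George describes can be checked with similar arithmetic. In MPI_Alltoall, the block for rank i starts i * count * extent bytes past the buffer address, and each element of the datatype can touch memory up to true_lb + true_extent bytes from its own start. A minimal sketch, assuming an 8x8 float array (256 bytes), 2 ranks, a count of 1, and the extents printed above; the function names are hypothetical and no MPI call is made:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: offset (in bytes from the start of the allocation)
 * one past the last byte an alltoall buffer can touch. `start_offset` is
 * where the buffer pointer sits inside the allocation. */
static size_t alltoall_end_offset(size_t start_offset, size_t nranks,
                                  size_t count, size_t extent,
                                  size_t true_lb, size_t true_extent)
{
    assert(nranks > 0 && count > 0);
    /* Elements are laid out `extent` bytes apart; the last one is
     * element number nranks * count - 1. */
    size_t last_elem_start = start_offset + (nranks * count - 1) * extent;
    return last_elem_start + true_lb + true_extent;
}

/* Returns 1 if every touched byte lies inside an allocation of
 * `alloc_bytes` bytes, 0 otherwise. */
static int alltoall_fits(size_t alloc_bytes, size_t start_offset,
                         size_t nranks, size_t count, size_t extent,
                         size_t true_lb, size_t true_extent)
{
    return alltoall_end_offset(start_offset, nranks, count, extent,
                               true_lb, true_extent) <= alloc_bytes;
}
```

With these assumed values, a buffer pointer at the start of the array touches exactly 256 bytes, so `alltoall_fits(256, 0, 2, 1, 16, 0, 240)` holds, while any nonzero starting offset (for example 16 bytes) runs past the end of the allocation.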

I've attached a drawing similar to yours that's more focused on the
specific instance in this code. Hopefully, this clears up the
algorithm a bit.

Thanks,
Spenser

-- 
Spenser Gilliland
Computer Engineer
Doctoral Candidate