Jeff Squyres wrote:
I think you're getting one big contiguous block of memory and the
portions that are passed are contiguous, nonoverlapping pieces.
On Dec 3, 2009, at 10:56 AM, Brock Palen wrote:
The allocation statement is ok:

    allocate( vec(32768, 2350) )

This allocates memory for vec(32768, 2350).
So this allocates a matrix with 32768 rows and 2350 columns -- all stored contiguously in memory, in column-major order. Does the language/compiler *guarantee* that the entire matrix is contiguous in memory? Or does it only guarantee that the *columns* are contiguous -- with possible gaps between successive columns?
No -- an allocatable array is a single contiguous block; there are no gaps between columns. In Fortran, the leftmost index varies the fastest. E.g.,
This means that in the first iteration, you're calling:
call MPI_RECV(vec(1, 2301), 32768, ...)
And in the last iteration, you're calling:
call MPI_RECV(vec(1, 2350), 32768, ...)
That doesn't seem right. If I'm reading this right -- and I very well may not be -- it looks like successive receives will be partially overlaying the previous receive.
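A quick way to check this is with the column-major offset arithmetic (a sketch in Python, not MPI code; the helper name start_offset is mine): in a 32768 x 2350 array, element vec(i, j) sits at linear offset (j-1)*32768 + (i-1), so a receive of 32768 elements starting at vec(1, j) covers exactly one column.

```python
NROWS = 32768  # leading dimension of vec(32768, 2350)

def start_offset(j):
    """Linear offset of vec(1, j) in a column-major array (illustrative helper)."""
    return (j - 1) * NROWS

# Each receive of 32768 elements starting at vec(1, j) fills
# offsets [start, start + 32768) -- one full column.  Successive
# receives are therefore contiguous and non-overlapping:
for j in range(2301, 2350):
    assert start_offset(j) + NROWS == start_offset(j + 1)
```

Under that layout, the receives into vec(1, 2301) ... vec(1, 2350) tile the tail of the array back-to-back, matching the "contiguous, nonoverlapping pieces" reading above.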
% cat y.f90
program y
  integer :: a(2,2)
  a(1,1) = 11
  a(2,1) = 21
  a(1,2) = 12
  a(2,2) = 22
  write(*,'(4i3)') a
end program y
% gfortran y.f90 -o y && ./y
 11 21 12 22
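The same traversal can be mimicked in Python (a sketch only; Python lists are not laid out like Fortran arrays, but the index order makes the point):

```python
# a[i-1][j-1] stands in for Fortran a(i,j)
a = [[0, 0], [0, 0]]
a[0][0] = 11
a[1][0] = 21
a[0][1] = 12
a[1][1] = 22

# Column-major traversal: the leftmost index (the row) varies fastest.
flat = [a[i][j] for j in range(2) for i in range(2)]
print(flat)  # [11, 21, 12, 22] -- same order as the Fortran output
```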
Here is how I think of Brock's code:
program sketch
  use mpi
  integer, parameter :: n = 32 * 1024, m = 50
  integer :: np, me, i, j, ierr
  double complex :: buf(n), x(n)
  call MPI_INIT(ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, np, ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, me, ierr)
  buf = 0; x = 0
  if ( me == 0 ) then
     do i = 1, np-1
        do j = 1, m
           call MPI_RECV(buf, n, MPI_DOUBLE_COMPLEX, i, j, MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierr)
        end do
     end do
  else
     do j = 1, m
        call MPI_SEND(x, n, MPI_DOUBLE_COMPLEX, 0, j, MPI_COMM_WORLD, ierr)
     end do
  end if
  call MPI_FINALIZE(ierr)
end program sketch

This version reuses the send and receive buffers, but that's fine since
they're all blocking calls anyhow.
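The shape of that exchange can be modeled in a toy, single-process Python sketch (no real MPI; recv here is a stand-in that fabricates each message, and the sizes are shrunk for illustration). It shows why reusing one buffer is safe under blocking semantics: each receive completes before the buffer is touched again.

```python
# Toy model of the blocking pattern: rank 0 receives m messages
# from each of ranks 1 .. np-1, reusing a single buffer.
n, m, np = 4, 3, 3  # tiny stand-ins for 32*1024, 50, and the job size

def recv(src, tag):
    """Stand-in for MPI_RECV: returns the message (src, tag) would carry."""
    return [complex(src, tag)] * n

received = 0
buf = [0j] * n
for i in range(1, np):         # sources 1 .. np-1
    for j in range(1, m + 1):  # tags 1 .. m
        buf = recv(i, j)       # "blocking": fully done before the next one
        received += 1

print(received)  # (np - 1) * m = 6 messages in total
```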