
Subject: Re: [OMPI users] Program deadlocks, on simple send/recv loop
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-12-03 12:21:50


On Dec 3, 2009, at 10:56 AM, Brock Palen wrote:

> The allocation statement is ok:
> allocate(vec(vec_size,vec_per_proc*(size-1)))
>
> This allocates memory vec(32768, 2350)

So this allocates a 32768 x 2350 matrix -- 32768 rows by 2350 columns -- stored in column-major order, so each column of 32768 elements is one run of memory. Does the language/compiler *guarantee* that the entire matrix is contiguous in memory? Or does it only guarantee that the *columns* are contiguous in memory -- and there may be gaps between successive columns?
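
FWIW, here's a standalone sketch (not taken from your code -- I'm assuming double complex elements and the sizes quoted above) that checks the column spacing directly: it prints the byte distance between the starts of columns 1 and 2, which should be exactly vec_size * 16 bytes if the allocation really is one contiguous column-major block:

   program column_gap_check
     use iso_c_binding
     implicit none
     ! Sketch only: assumes 16-byte double complex elements.
     integer, parameter :: vec_size = 32768, ncols = 2350
     complex(c_double_complex), allocatable, target :: vec(:,:)
     integer(c_intptr_t) :: addr1, addr2

     allocate(vec(vec_size, ncols))

     ! Byte addresses of the first element of column 1 and of column 2.
     addr1 = transfer(c_loc(vec(1,1)), addr1)
     addr2 = transfer(c_loc(vec(1,2)), addr2)

     ! Contiguous column-major storage => exactly vec_size * 16 bytes
     ! between column starts; anything larger means padding/gaps.
     write (6,*) 'bytes between column starts:', addr2 - addr1
     write (6,*) 'expected if contiguous:     ', vec_size * 16_c_intptr_t

     deallocate(vec)
   end program column_gap_check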

2350 = vec_per_proc * (size - 1) = 50 * 47, so you're running with 48 procs.

In the loop:

     do irank=1,size-1
        do ivec=1,vec_per_proc
           write (6,*) 'irank=',irank,'ivec=',ivec
           vec_ind=(irank-1)*vec_per_proc+ivec
           call MPI_RECV( vec(1,vec_ind), vec_size, MPI_DOUBLE_COMPLEX, irank, &
                vec_ind, MPI_COMM_WORLD, status, ierror)

This means that in the first iteration, you're calling:

irank = 1
ivec = 1
vec_ind = (1 - 1) * 50 + 1 = 1
call MPI_RECV(vec(1, 1), 32768, ...)

And in the last iteration, you're calling:

irank = 47
ivec = 50
vec_ind = (47 - 1) * 50 + 50 = 2350
call MPI_RECV(vec(1, 2350), 32768, ...)
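
To make the memory picture concrete, here's a little sketch (again assuming vec_per_proc = 50 and 48 ranks, which is what 2350 columns implies) that prints, for the first and last trips through the loop, which linear column-major elements a vec_size-element receive starting at vec(1, vec_ind) would cover -- back-to-back with no overlap, but only if the whole array is contiguous:

   program recv_targets
     implicit none
     ! Assumed values: vec_per_proc = 50 and 48 ranks (=> 2350 columns).
     integer, parameter :: vec_size = 32768, vec_per_proc = 50, nprocs = 48
     integer :: irank, ivec, vec_ind, first_el, last_el

     do irank = 1, nprocs - 1
        do ivec = 1, vec_per_proc
           vec_ind = (irank - 1) * vec_per_proc + ivec
           ! Linear (column-major) element range covered by a receive of
           ! vec_size elements starting at vec(1, vec_ind), if contiguous.
           first_el = (vec_ind - 1) * vec_size + 1
           last_el  = vec_ind * vec_size
           if (vec_ind == 1 .or. vec_ind == (nprocs - 1) * vec_per_proc) &
                write (6,*) 'vec_ind =', vec_ind, ' covers elements', &
                first_el, ' to', last_el
        end do
     end do
   end program recv_targets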

If I'm reading this right -- and I very well may not be -- each receive of 32768 elements starting at vec(1, vec_ind) assumes it will fill exactly column vec_ind. That only works if the storage is contiguous; if there are gaps between columns, a receive would run past the end of its column and partially overlay the next one. Is that what you intended? I can see how a problem might occur here if the columns are not actually contiguous in memory...?

-- 
Jeff Squyres
jsquyres_at_[hidden]