I found a presentation on the web that showed significant performance
benefits for one-sided communication. I presumed they came from hardware
RDMA support that the one-sided calls could take advantage of, but I gather
from your question that this is not necessarily the case. Are you aware of
cases in which it has made a significant difference?
On 12/15/10 9:18 PM, "Jeff Squyres" <jsquyres_at_[hidden]> wrote:
> Is there a reason to convert your code from send/receive to put/get?
> The performance may not be that significantly different, and as you have
> noted, the MPI-2 put/get semantics are a total nightmare to understand (I
> personally advise people not to use them -- MPI-3 is cleaning up the put/get
> semantics a LOT).
> On Dec 15, 2010, at 3:15 PM, Grismer, Matthew J Civ USAF AFMC AFRL/RBAT wrote:
>> I am trying to modify the communication routines in our code to use
>> MPI_Put's instead of sends and receives. This worked fine for several
>> variable Put's, but now I have one that is causing seg faults. Reading
>> through the MPI documentation it is not clear to me if what I am doing
>> is permissible or not. Basically, the question is this - if I have
>> defined all of an array as a window on each processor, can I PUT data
>> from that array to remote processes at the same time as the remote
>> processes are PUTing into the local copy, assuming no overlaps of any of
>> the PUTs?
>> Here are the details if that doesn't make sense. I have a (Fortran)
>> array QF(6,2,N) on each processor, where N could be a very large number
>> (100,000). I create a window QFWIN on the entire array on all the
>> processors. I define MPI_Type_indexed "sending" datatypes (QFSND) with
>> block lengths of 6 that send from QF(1,1,*), and MPI_Type_indexed
>> "receiving" datatypes (QFREC) with block lengths of 6 that receive into
>> QF(1,2,*). Here * is a non-repeating set of integers up to N. I create
>> groups of processors that communicate, where these groups will all
>> exchange QF data, PUTing local QF(1,1,*) to remote QF(1,2,*). So,
>> processor 1 is PUTing QF data to processors 2,3,4 at the same time 2,3,4
>> are putting their QF data to 1, and so on. Processors 2,3,4 are PUTing
>> into non-overlapping regions of QF(1,2,*) on 1, and 1 is PUTing from
>> QF(1,1,*) to 2,3,4, and so on. So, my calls look like this on each
>> processor:
>> assertion = 0
>> call MPI_Win_post(group, assertion, QFWIN, ierr)
>> call MPI_Win_start(group, assertion, QFWIN, ierr)
>> do I=1,neighbors
>> call MPI_Put(QF, 1, QFSND(I), NEIGHBOR(I), 0, 1, QFREC(I), QFWIN, ierr)
>> end do
>> call MPI_Win_complete(QFWIN,ierr)
>> call MPI_Win_wait(QFWIN,ierr)
>> Note I did define QFREC locally on each processor to properly represent
>> where the data was going on the remote processors. The error value
>> ierr=0 after MPI_Win_post, MPI_Win_start, MPI_Put, and MPI_Win_complete,
>> and the code seg faults in MPI_Win_wait.
>> I'm using Open MPI 1.4.3 on Mac OS X 10.6.5, built with Intel XE (12.0)
>> compilers, and running on just 2 (internal) processors of my Mac Pro.
>> The code ran normally with this configuration up until the point I put
>> the above in. Several other communications with MPI_Put similar to the
>> above work fine, though the windows are only on a subset of the
>> communicated array, and the origin data is being PUT from part of the
>> array that is not within the window.