Guess I should have kept quiet a bit longer. As I recall, we had already seen a counterexample to Jeff's stronger statement, and that motivated my narrower one.

Do you have a counterexample for my more cautious assertion? (I had already granted that a correct MPI program could be made incorrect with a barrier, and the barrier that broke it would have to be considered "semantically relevant". I would reword the statement with that in mind if I were to offer it up again.)


Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363


users-bounces@open-mpi.org wrote on 08/24/2009 04:23:30 PM:

> Re: [OMPI users] Any scientific application heavily using MPI_Barrier?
> Eugene Loh
> to: Open MPI Users
> 08/24/2009 04:25 PM
> Sent by: users-bounces@open-mpi.org
> Please respond to Open MPI Users
>
> Jeff Squyres wrote:
>
> > On Aug 24, 2009, at 1:03 PM, Eugene Loh wrote:
> >
> >> E.g., let's say P0 and P1 each send a message to P2, both using the  
> >> same tag and communicator.  Let's say P2 does two receives on that  
> >> communicator and tag, using a wildcard source.  So, the messages  
> >> could be received in either order.  One could introduce barriers to  
> >> order the messages.  E.g.,
> >>
> >> P0:
> >>   Send
> >>   Barrier
> >> P1:
> >>   Barrier
> >>   Send
> >> P2:
> >>   Recv
> >>   Barrier
> >>   Recv
> >
> > Is this behavior *guaranteed* by MPI?  I'm not actually sure that it  
> > is; barrier does not provide any guarantees about point-to-point  
> > message passing progress.
> >
> > For example, how about a machine with these assumptions:
> >
> > - P0 is "far away" from P2 on the point-to-point network
> > - P1 is "close by" to P2 on the point-to-point network
> > - Barriers go across a separate/fast network (think: bluegene)
> > - P0's send message is short/eager
> >
> > In this case, the Send from P0 completes "immediately" and P0 enters  
> > the barrier before the message is delivered to P2.  The P0 send could  
> > then take a "long time" to get to P2 --
>
> Okay, so let's say P0 completes its send and enters the barrier.
>
> Also, P1 enters the barrier.  But it will not issue a send until it
> leaves the barrier, which requires that the last process has entered the
> barrier.
>
> Meanwhile, the last process, P2, is waiting on a receive before it
> enters the barrier.
>
> So, here's the situation.  P2 is waiting to receive a message, a message
> has been sent to P2, and no other message will be sent to P2 until some
> message has been received.  So, there are only two options:
>
> 1) The first receive on P2 receives the message from P0.  Or,
>
> 2) This perfectly legal MPI program deadlocks.
>
> Right?
>
> > potentially long enough for the barrier to  overtake it
>
> No.  The first Recv on P2 has to complete before P2 can enter the
> barrier, which is a prerequisite for the barrier to complete on any process.
>
> > and for the Send from P1 to be delivered to P2 before the  Send from
> > P0 arrives at P2.
> >
> > Couldn't that happen?
>
> No.  The send on P1 cannot be issued before the barrier completes on P1,
> which cannot happen before the barrier is entered on P2, which cannot
> happen before the first Recv on P2 is completed, which cannot happen
> until some message is received on P2.  And, the only message that can be
> received on P2 is the one issued by P0.
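
The ordering argument above can be sketched with threads standing in for the three ranks. This is only an analogy under stated assumptions, not MPI code: `threading.Barrier` plays the role of the barrier, a `queue.Queue` plays the role of P2's wildcard-source receive, and the hypothetical helper `eager_send` models an eager send that returns at once while delivery happens later. The names `run_once`, `eager_send` and `p0_delay` are made up for illustration.

```python
import queue
import threading


def run_once(p0_delay=0.05):
    """Simulate the Send/Barrier/Recv pattern; return the arrival order at P2."""
    inbox = queue.Queue()            # stands in for P2's wildcard-source receive
    barrier = threading.Barrier(3)   # stands in for a barrier over P0, P1, P2
    received = []

    def eager_send(src, delay):
        # Assumption: the send "completes" locally right away, and the
        # message reaches P2 only after `delay` seconds.
        threading.Timer(delay, inbox.put, args=(src,)).start()

    def p0():
        eager_send(0, p0_delay)      # P0 is "far away": slow delivery
        barrier.wait()

    def p1():
        barrier.wait()               # P1 cannot send until P2 has entered
        eager_send(1, 0.0)           # P1 is "close by": fast delivery

    def p2():
        received.append(inbox.get()) # blocks until some message arrives
        barrier.wait()
        received.append(inbox.get())

    threads = [threading.Thread(target=f) for f in (p0, p1, p2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


if __name__ == "__main__":
    print(run_once())
```

However large `p0_delay` is made, P2's first receive can only be satisfied by P0's message, because P1 stays in the barrier until P2 enters it. The sketch says nothing about what a particular MPI implementation does internally; it only mirrors the dependency chain described above.
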
>
> > Granted, I would expect that your example would perform in most real-
> > world situations as you describe (P0 is delivered to P2, then P1 is  
> > delivered to P2).  But I don't think the standard guarantees it.
>
> _______________________________________________
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users