Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Any scientific application heavily using MPI_Barrier?
From: Eugene Loh (Eugene.Loh_at_[hidden])
Date: 2009-08-24 16:23:30


Jeff Squyres wrote:

> On Aug 24, 2009, at 1:03 PM, Eugene Loh wrote:
>
>> E.g., let's say P0 and P1 each send a message to P2, both using the
>> same tag and communicator. Let's say P2 does two receives on that
>> communicator and tag, using a wildcard source. So, the messages
>> could be received in either order. One could introduce barriers to
>> order the messages. E.g.,
>>
>> P0:
>> Send
>> Barrier
>> P1:
>> Barrier
>> Send
>> P2:
>> Recv
>> Barrier
>> Recv
>
> Is this behavior *guaranteed* by MPI? I'm not actually sure that it
> is; barrier does not provide any guarantees about point-to-point
> message passing progress.
>
> For example, how about a machine with these assumptions:
>
> - P0 is "far away" from P2 on the point-to-point network
> - P1 is "close by" to P2 on the point-to-point network
> - Barriers go across a separate/fast network (think: bluegene)
> - P0's send message is short/eager
>
> In this case, the Send from P0 completes "immediately" and P0 enters
> the barrier before the message is delivered to P2. The P0 send could
> then take a "long time" to get to P2 --

Okay, so let's say P0 completes its send and enters the barrier.

Also, P1 enters the barrier. But it will not issue a send until it
leaves the barrier, which requires that the last process has entered the
barrier.

Meanwhile, the last process, P2, is waiting on a receive before it
enters the barrier.

So, here's the situation. P2 is waiting to receive a message, a message
has been sent to P2, and no other message will be sent to P2 until some
message has been received. So, there are only two options:

1) The first receive on P2 receives the message from P0. Or,

2) This perfectly legal MPI program deadlocks.

Right?

> potentially long enough for the barrier to overtake it

No. The first Recv on P2 has to complete before P2 can enter the
barrier, which is a prerequisite for the barrier to complete on any process.

> and for the Send from P1 to be delivered to P2 before the Send from
> P0 arrives at P2.
>
> Couldn't that happen?

No. The send on P1 cannot be issued before the barrier completes on P1,
which cannot happen before the barrier is entered on P2, which cannot
happen before the first Recv on P2 is completed, which cannot happen
until some message is received on P2. And, the only message that can be
received on P2 is the one issued by P0.
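
The dependency chain above can be checked with a small thread-based model. This is only a sketch, not MPI: the three ranks are Python threads, `threading.Barrier` stands in for MPI_Barrier, a `queue.Queue` stands in for P2's wildcard-source receive, and the hypothetical `p0_delay` parameter models the slow "far away" link from P0 to P2. All names here are assumptions made for illustration.

```python
import queue
import threading


def run_once(p0_delay=0.2):
    """Model P0, P1 and P2 as threads; return sources in P2's receive order."""
    inbox = queue.Queue()            # P2's receive queue (same tag, any source)
    barrier = threading.Barrier(3)   # stands in for MPI_Barrier over 3 ranks
    received = []

    def p0():
        # Eager send: returns at once, but the message only reaches P2
        # after p0_delay seconds (the slow point-to-point network).
        delivery = threading.Timer(p0_delay, inbox.put, args=(0,))
        delivery.start()
        barrier.wait()
        delivery.join()

    def p1():
        barrier.wait()
        inbox.put(1)                 # issued only after leaving the barrier

    def p2():
        received.append(inbox.get())  # first Recv blocks until a message arrives
        barrier.wait()
        received.append(inbox.get())  # second Recv

    threads = [threading.Thread(target=f) for f in (p0, p1, p2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


if __name__ == "__main__":
    print(run_once())  # [0, 1]
```

However large `p0_delay` is made, the barrier cannot release until P2's first receive has returned, so P1's message is never in flight while that receive is pending.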

> Granted, I would expect that your example would perform in most real-
> world situations as you describe (P0 is delivered to P2, then P1 is
> delivered to P2). But I don't think the standard guarantees it.