Open MPI Development Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-11-07 11:11:39

Don, Galen, and I talked about this in depth on the phone today and
think that it is a symptom of the same issue discussed in this thread:

Note my message in that thread from just a few minutes ago:

We think that the proposed solution to that thread will also fix the
mpi_preconnect_all issues (i.e., the ping-pong that Don proposes in
his mail should not be necessary).

On Oct 17, 2007, at 10:54 AM, Don Kerr wrote:

> All,
> I have noticed an issue in the 1.2 branch when mpi_preconnect_all=1.
> The one-way communication pattern (ranks either send to or receive
> from each other) may not fully establish connections with peers. For
> example, if I have a 3-process MPI job and rank 0 does not do any MPI
> communication after MPI_Init(), the other ranks' attempts to connect
> will not be progressed (I have seen this with tcp and udapl).
> The preconnect pattern has changed slightly in the trunk, but
> essentially it is still a one-way communication, either send or
> receive with each rank. So although the issue I see in the 1.2 branch
> does not appear in the trunk, I wonder if this will show up again.
> An alternative to the preconnect pattern that comes to mind would be
> to perform a send and receive between all ranks to ensure that
> connections have been fully established.
> Does anyone have thoughts or comments on this, or reasons not to have
> all ranks send and receive from all?
> -DON
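
To illustrate the send-and-receive alternative described in the quoted
message, here is a minimal sketch in Python. It is not Open MPI's
preconnect code and does not use MPI at all; it only models one possible
ring-offset schedule in which every rank both sends to and receives from
every other rank. The function names (exchange_schedule,
is_fully_bidirectional, sends_match_receives) are hypothetical and chosen
for this example. In a real MPI program, each step could presumably be
carried out with a combined send/receive call such as MPI_Sendrecv.

```python
def exchange_schedule(size):
    """Map each rank to a list of (send_to, recv_from) pairs, one per step.

    At step `offset` (1 .. size-1), a rank sends to the peer `offset`
    positions ahead of it and receives from the peer `offset` behind it.
    """
    schedule = {}
    for rank in range(size):
        steps = []
        for offset in range(1, size):
            send_to = (rank + offset) % size
            recv_from = (rank - offset) % size
            steps.append((send_to, recv_from))
        schedule[rank] = steps
    return schedule


def is_fully_bidirectional(schedule):
    """True if every rank both sends to and receives from all other ranks."""
    size = len(schedule)
    for rank, steps in schedule.items():
        peers = set(range(size)) - {rank}
        sent = {send_to for send_to, _ in steps}
        received = {recv_from for _, recv_from in steps}
        if sent != peers or received != peers:
            return False
    return True


def sends_match_receives(schedule):
    """True if each send at step k is met by a matching receive at step k."""
    for rank, steps in schedule.items():
        for k, (send_to, _) in enumerate(steps):
            # The target's receive source at the same step must be this rank.
            if schedule[send_to][k][1] != rank:
                return False
    return True


if __name__ == "__main__":
    # The 3-process job from the quoted message.
    for rank, steps in exchange_schedule(3).items():
        print(rank, steps)
```

With this schedule, rank 0 in the 3-process example takes part in both a
send and a receive with ranks 1 and 2, so no pair relies on a single
one-way message to establish its connection.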

Jeff Squyres
Cisco Systems