I have noticed an issue in the 1.2 branch when mpi_preconnect_all=1. The
one-way communication pattern (each pair of ranks either sends or
receives, but not both) may not fully establish connections with peers.
For example, if I have a 3-process MPI job and rank 0 does no MPI
communication after MPI_Init(), the other ranks' attempts to connect to
it will never be progressed (I have seen this with TCP and uDAPL).
The preconnect pattern has changed slightly on the trunk, but it is
still essentially one-way communication: each rank either sends to or
receives from each peer. So although the issue I see in the 1.2 branch
does not appear on the trunk, I wonder whether it could show up again.
An alternative preconnect pattern that comes to mind would be to
perform both a send and a receive between every pair of ranks, which
would ensure that each connection is fully established in both
directions.
Does anyone have thoughts or comments on this, or reasons not to have
all ranks both send to and receive from all others?