George Bosilca wrote:
> Yes, in Open MPI the connections are usually created on demand. As far
> as I know there are a few devices that do not abide by this "law", but
> MX is not one of them.
> To be more precise on how the connections are established, if we say
> that each node has two rails and we're doing a ping-pong, the first
> message from p0 to p1 will connect the first NIC, and the second
> message the second NIC (here I made the assumption that both networks
> are similar). Moreover, in MX the connection is not symmetric, so your
> (1) and (2) might happen simultaneously.
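If I read this correctly, the behavior can be pictured with the toy model
below. It is only a sketch of the description above, not Open MPI or MX
code: the LazyRails class, its send() method, and the strict round-robin
choice of rail are my own assumptions for illustration. Each direction
(p0->p1 and p1->p0) keeps its own connection state, which is how I
understand "not symmetric".

```python
from collections import defaultdict


class LazyRails:
    """Toy model: a rail is connected on first use, per direction."""

    def __init__(self, rails_per_node=2):
        self.rails_per_node = rails_per_node
        # (src, dst) -> rails already connected in that direction
        self.connected = defaultdict(list)
        # (src, dst) -> number of messages sent so far in that direction
        self.sent = defaultdict(int)

    def send(self, src, dst):
        """Send one message; return (rail, newly_connected)."""
        key = (src, dst)
        # Assumption: messages are striped round-robin over the rails.
        rail = self.sent[key] % self.rails_per_node
        self.sent[key] += 1
        newly_connected = rail not in self.connected[key]
        if newly_connected:
            # Connect on demand, only for this direction.
            self.connected[key].append(rail)
        return rail, newly_connected


net = LazyRails(rails_per_node=2)
events = []
for _ in range(2):  # two ping-pong round trips between p0 and p1
    events.append(("p0->p1",) + net.send("p0", "p1"))
    events.append(("p1->p0",) + net.send("p1", "p0"))

for event in events:
    print(event)
```

In this model the first two round trips each trigger a connect in both
directions (rail 0, then rail 1), and later messages reuse the existing
connections.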
OK. I still don't see why I couldn't reproduce the problem with MX when
the progression thread was disabled. But I found a way to work around
the problem in Open-MX, so we should be good now.
> Does the code contain an MPI_Barrier? If yes, this might be why you
> see the sequence (1), (2), (3) and (4) ...
It was hanging during startup in the Intel MPI Benchmarks. It looks like
MPI_Comm_split() in IMB_set_communicator() was causing the problem.
Thanks a lot.