On Jun 13, 2007, at 11:15 AM, Nysal Jan wrote:
I was just bitten yesterday by a problem that I've known about for a
while but had never gotten around to looking into (I could have sworn
that there was an open trac ticket on this, but I can't find one).
I have 2 hosts: one with 3 active ports and one with 2 active ports.
If I run an MPI job between them, the openib BTL wireup goes badly and
it aborts. So a heterogeneous number of ports is not currently
handled properly in the code.
I don't know if Gleb's patch addresses this situation or not; I'll
look at his patch this afternoon.
There is a ticket (closed) here: https://svn.open-mpi.org/trac/ompi/ticket/548
It was fixed by Galen for 1.2. There is also a FAQ entry about this: http://www.open-mpi.org/faq/?category=openfabrics#ofa-port-wireup
I think Gleb's patch addresses a potential race condition when both sides attempt to connect at the same time.