On Jun 13, 2007, at 12:08 PM, Gleb Natapov wrote:
> I am not committing this yet. I want people to review my logic and the
> patch. If the change is OK with everyone who cares then I want this
> change to go into the 1.2 branch.
>
> I don't care how this change will get to the trunk. I can use a patched
> version for a while. If your branch is in a working state right now I can
> merge this change into it tomorrow.
I was just bitten yesterday by a problem that I've known about for a
while but had never gotten around to looking into (I could have sworn
that there was an open trac ticket on this, but I can't find one
anywhere).
I have 2 hosts: one with 3 active ports and one with 2 active ports.
If I run an MPI job between them, the openib BTL wireup goes badly and
the job aborts. So a heterogeneous number of ports is not currently
handled properly in the code.
I don't know if Gleb's patch addresses this situation or not; I'll
look at his patch this afternoon.
--
Jeff Squyres
Cisco Systems