On Mar 12, 2010, at 2:08 PM, Nick Edmonds wrote:
> Currently the openib BTL silently refuses to run when MPI_THREAD_MULTIPLE is specified (ompi/mca/btl/openib/btl_openib_component.c:2367 in the current trunk, r22822) which leads to confusing (to some people) error messages such as:
> PML add procs failed --> Returned "Unreachable" (-12) instead of "Success" (0)
> Would it be possible to provide a warning/error indicating that the BTL failed to load, and why?
Hmm. My first thought was, "sure!" But then after thinking about it for a minute, I realized that we actually *want* it to fail silently -- it's in keeping with the Open MPI philosophy of just using what's available and not complaining about what's not available. More specifically, how exactly would we know when a user wants us to complain that something is not available?
That being said, we can probably improve the "PML add procs failed..." message to make it clearer. For example, this specific message means that some peer is unreachable, which *usually* means that there's no BTL available to reach it, which *usually* means that a BTL you expected to load did not. I'll make this better -- thanks for the heads-up.
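To make that chain of "usually" concrete, here is a rough sketch in C of how a silently excluded transport can surface later as "Unreachable". It is only a toy model, not Open MPI's actual code: the `sketch_*` names, the struct fields, and the selection logic are all made up for illustration, and the only value taken from the real message is the -12 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes; -12 mirrors the "Unreachable" value quoted above. */
enum { SKETCH_SUCCESS = 0, SKETCH_UNREACHABLE = -12 };

/* Hypothetical stand-in for a transport component (not Open MPI's real structs). */
typedef struct {
    const char *name;
    bool thread_multiple_ok;  /* can it run under MPI_THREAD_MULTIPLE? */
    bool reaches_remote;      /* can it reach peers on other nodes? */
} sketch_component;

/* Keep only the components that accept the requested threading level.
 * Components that decline are dropped without any message.
 * Returns the number of components stored in kept[]. */
static size_t sketch_select(const sketch_component *all, size_t n,
                            bool want_thread_multiple,
                            const sketch_component **kept)
{
    assert(all != NULL && kept != NULL);
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        if (want_thread_multiple && !all[i].thread_multiple_ok)
            continue;  /* silent exclusion: nothing is printed here */
        kept[k++] = &all[i];
    }
    return k;
}

/* A remote peer is reachable only if some surviving component can reach it.
 * The failure shows up here, far away from the place where the component
 * was actually dropped. */
static int sketch_add_remote_peer(const sketch_component *const *kept, size_t k)
{
    assert(kept != NULL);
    for (size_t i = 0; i < k; i++)
        if (kept[i]->reaches_remote)
            return SKETCH_SUCCESS;
    return SKETCH_UNREACHABLE;
}
```

In this model, a component list holding a remote-capable transport that declines MPI_THREAD_MULTIPLE plus a loopback-only one selects cleanly, and only the later peer setup returns -12, which is why the message alone does not say which transport dropped out or why.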
> The logical next question would be, is anyone working on an openib BTL that supports MPI_THREAD_MULTIPLE? I'm currently stuck using IPoIB which is obviously undesirable from a performance standpoint.
IBM has been doing a bunch of MPI_THREAD_MULTIPLE improvements recently, many of them surrounding the openib BTL. I don't know what their specific goals are, though.