On Jun 11, 2012, at 7:48 PM, Yong Qin wrote:
> ah, I guess my original understanding of PML was wrong. Adding "-mca
> pml ob1" does help to ease the problem.
See the README for a little more discussion about this issue. There can only be 1 PML in use by a given MPI job -- using "--mca pml ob1" forces the use of the "ob1" PML (i.e., the BTLs), as opposed to the "cm" PML (i.e., the MTLs).
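For reference, here are two common ways to pass that setting. This is only a sketch: the application name and process count are made up for illustration, and the mpirun command is printed rather than run so you can check it first.

```shell
# Hypothetical application and process count; substitute your own.
APP=./my_mpi_app
NP=4

# Option 1: select the ob1 PML on the mpirun command line.
CMD="mpirun --mca pml ob1 -np $NP $APP"
echo "$CMD"

# Option 2: set the same MCA parameter through the environment.
# Open MPI reads any variable named OMPI_MCA_<param> as an MCA parameter.
export OMPI_MCA_pml=ob1
```

Either form should have the same effect; the environment variable is handy when you can't easily edit the mpirun line (e.g., inside a batch script wrapper).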
> But the question still
> remains. Why ompi decided to use the mx BTL in the first place, given
> there's no physical device onboard at all? This behavior is completely
> different than the original gm BTL.
That's not what is actually happening.
Open MPI was *built* with MX support, and it therefore assumes that you will likely want to use it. So it *warns* you when there is no MX device available.
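If you want to confirm that your installation was built with MX support, ompi_info lists the compiled-in components. A rough sketch follows; the helper function name is made up, and the pattern assumes the usual "MCA <framework>: <component> (...)" line format that ompi_info prints.

```shell
# Hypothetical helper: reads ompi_info output on stdin and succeeds
# if an mx BTL or MTL component appears in the component list.
has_mx_component() {
    grep -Eq 'MCA (btl|mtl): mx( |$)'
}

# Only query ompi_info if it is actually on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_mx_component; then
        echo "This Open MPI build includes MX components"
    else
        echo "No MX components found in this Open MPI build"
    fi
fi
```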
That being said, I have recently run into the issue you are seeing: if OMPI 1.6 warns you that there is no high-speed device available (openib in my case), it then segv's (which it obviously shouldn't -- it should warn and then die gracefully). I'll open a ticket on this behavior. It's not a common scenario, but we still shouldn't segv.
My first guess is that this has something to do with the memory manager... but that's a guess.