Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Myricom MX2G Segmentation fault on OMPI 1.6
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-06-15 09:41:07


On Jun 11, 2012, at 7:48 PM, Yong Qin wrote:

> ah, I guess my original understanding of PML was wrong. Adding "-mca
> pml ob1" does help to ease the problem.

See the README for a little more discussion about this issue. There can only be one PML in use by a given MPI job -- using "--mca pml ob1" forces the use of the "ob1" PML (i.e., the BTLs), as opposed to the "cm" PML (i.e., the MTLs).
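
As a rough illustration of the one-PML rule, here is a minimal Python sketch that assembles an mpirun command line forcing a single PML. The helper name mpirun_command, the process count, and the ./a.out program are hypothetical; only "--mca pml ob1" comes from this thread, and "--mca btl ..." is the usual way to pick BTLs under ob1. Limiting the choice to ob1 and cm is a simplification for the example, not a statement about every PML Open MPI ships.

```python
import shlex


def mpirun_command(pml, nprocs, program, btls=None):
    """Build an mpirun argv list that forces exactly one PML.

    Hypothetical helper: it only assembles arguments, it does not run MPI.
    """
    # Assumption for this sketch: only the two PMLs discussed in the thread.
    if pml not in ("ob1", "cm"):
        raise ValueError("this sketch only knows the ob1 and cm PMLs")

    argv = ["mpirun", "-np", str(nprocs), "--mca", "pml", pml]

    if btls:
        # BTLs are only used by the ob1 PML; cm goes through an MTL instead.
        if pml != "ob1":
            raise ValueError("BTLs can only be selected with the ob1 PML")
        argv += ["--mca", "btl", ",".join(btls)]

    argv.append(program)
    return argv


if __name__ == "__main__":
    # Print a shell-ready command line for an ob1 run over TCP and shared memory.
    cmd = mpirun_command("ob1", 4, "./a.out", btls=["tcp", "sm", "self"])
    print(shlex.join(cmd))
```

The point of the check on btls is simply that a BTL list has no effect unless ob1 is the PML in use.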

> But the question still
> remains. Why ompi decided to use the mx BTL in the first place, given
> there's no physical device onboard at all? This behavior is completely
> different than the original gm BTL.

That's not what is actually happening.

Open MPI was *built* with MX support, and it therefore assumes that you will likely want to use it. So it *warns* you when there is no MX device available.

That being said, I have recently run into the issue you are seeing: if OMPI 1.6 warns you that there is no high-speed device available (openib in my case), it then segv's (which it obviously shouldn't -- it should warn and then die gracefully). I'll open a ticket on this behavior. It's not a common scenario, but we still shouldn't segv.

My first guess is that this has something to do with the memory manager... but that's a guess.

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/