Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Segmentation fault in MPI_Finalize with IB hardware and memory manager.
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-06-07 15:49:06


George --

Scott's patch was different from the one you applied. Apparently, his fixes this user's problem (I don't know whether Guillaume tested yours).

Which one wins?

On Jun 3, 2010, at 9:49 AM, Scott Atchley wrote:

> On Jun 3, 2010, at 8:54 AM, guillaume ranquet wrote:
>
> > granquet_at_bordeplage-15 ~ $ mpirun --mca btl mx,openib,sm,self --mca pml ^cm --mca mpi_leave_pinned 0 ~/bwlat/mpi_helloworld
> > [bordeplage-15.bordeaux.grid5000.fr:02707] Error in mx_init (error No MX device entry in /dev.)
> > Hello world from process 0 of 1
> >
> > it works :)
>
> Jeff, you may want to change this message to opal_output_verbose(). It is in $OMPI/ompi/mca/common/common_mx.c.
>
> >> Ok. I think that OMPI is trying to open the MX MTL first. It fails at
> >> mx_init() (the first error message) but it had already created some
> >> mpool resources. It then tries to open the MX BTL and it skips the MX
> >> initialization and returns SUCCESS. The MX BTL then tries to call
> >> mx_get_info() which fails and prints the second message.
> >>
> >> Try the attached patch. It tries to clean up if mx_init() fails and
> >> does not return SUCCESS on subsequent attempts to initialize MX.
> >>
> >> Scott
> >
> > I tried your patch and it seems to correct the issue:
> >
> > configured with: --prefix=$HOME/openmpi-1.4.2-nomx-bin/ --with-openib=/usr --with-mx=/usr
> >
> > $ ~/openmpi-1.4.2-nomx-bin/bin/mpirun ~/bwlat/mpi_helloworld
> > [bordeplage-15.bordeaux.grid5000.fr:22406] Error in mx_init (error No MX device entry in /dev.)
> > Hello world from process 0 of 1
>
> Excellent.
>
> > don't hesitate if you need further testing :)
>
> Thanks for all your assistance!
>
> > do you plan on applying this patch in the next release? (1.4.3?)
>
> Jeff, I leave this up to you and George.
>
> Scott
>
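
The failure mode Scott describes in the quoted text (a shared initialization routine that fails once, leaves resources behind, and then reports SUCCESS to the next caller) can be illustrated with a small standalone sketch. This is not Scott's patch, which is not reproduced in this message, and it is not the code in common_mx.c. Every name below (sketch_common_mx_initialize, fake_mx_init, the SKETCH_* codes, the boolean standing in for the mpool resources) is hypothetical and exists only to show the pattern: clean up when initialization fails, and remember the failure so later attempts do not return success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical status codes; the real code would use Open MPI's own. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_NOT_AVAILABLE = -1 };

/* Stand-in for the MX library: false simulates "No MX device entry in /dev". */
static bool fake_device_present = false;
static int fake_mx_init(void) { return fake_device_present ? 0 : 1; }
static void fake_mx_finalize(void) { }

/* Stand-in for resources created before the library call (e.g. mpool setup). */
static bool resources_allocated = false;
static void setup_resources(void) { resources_allocated = true; }
static void release_resources(void) { resources_allocated = false; }

static int init_refcount = 0;    /* components that initialized successfully */
static bool init_failed = false; /* set once an attempt has failed */

/* Shared initialization, called once per component (e.g. an MTL, then a BTL). */
int sketch_common_mx_initialize(void)
{
    if (init_failed) {
        /* An earlier attempt failed: do not pretend it worked. */
        return SKETCH_ERR_NOT_AVAILABLE;
    }
    if (init_refcount > 0) {
        /* Already initialized by another component: just count the user. */
        init_refcount++;
        return SKETCH_SUCCESS;
    }

    setup_resources();
    if (fake_mx_init() != 0) {
        fprintf(stderr, "sketch: MX initialization failed\n");
        release_resources(); /* undo what was created before the failure */
        init_failed = true;  /* make later attempts fail as well */
        return SKETCH_ERR_NOT_AVAILABLE;
    }

    init_refcount = 1;
    return SKETCH_SUCCESS;
}

/* Shared teardown: only the last successful user releases anything. */
int sketch_common_mx_finalize(void)
{
    assert(init_refcount >= 0);
    if (init_refcount == 0) {
        return SKETCH_SUCCESS; /* nothing was initialized, nothing to free */
    }
    if (--init_refcount == 0) {
        fake_mx_finalize();
        release_resources();
    }
    return SKETCH_SUCCESS;
}
```

In this sketch, a second caller that arrives after a failed first attempt gets an error instead of SUCCESS, so it never goes on to call into an uninitialized library, and the finalize path has nothing stale left to tear down.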

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/