Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] ga-4.1 on mx segmentation violation
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-10-22 11:41:56


On Oct 21, 2008, at 9:14 AM, SLIM H.A. wrote:

> I have built the release candidate for ga-4.1 with OpenMPI 1.2.3 and
> portland compilers 7.0.2 for Myrinet mx.
>
> Running the test.x for 3 Myrinet nodes each with 4 cores I get the
> following error messages:
>
> warning:regcache incompatible with malloc
> libibverbs: Fatal: couldn't read uverbs ABI version.
> --------------------------------------------------------------------------
> [0,1,3]: OpenIB on host node057 was unable to find any HCAs.
> Another transport will be used instead, although this may result in
> lower performance.
> -----------------------------------------------------------------------

FWIW, this specific warning is fixed in the upcoming v1.3 series (I
assume you built on a machine with libibverbs installed, but no
OpenFabrics-capable devices).

IIRC, you can manually disable this warning by telling Open MPI to
avoid the openib BTL (I can't test the v1.2 series on a Linux machine
ATM to verify this):

   mpirun --mca btl ^openib ...
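
A few equivalent ways to express this, offered as a sketch rather than something verified on v1.2. The quoting guards against shells that treat `^` specially, and `OMPI_MCA_btl` is the environment-variable form of the same MCA parameter. The `-np 12` and `./test.x` in the commented lines are assumptions based on the 3 nodes x 4 cores described above, and the explicit list assumes the mx, sm and self BTLs were built into this installation.

```shell
# Environment form of "--mca btl ^openib"; inherited by any later mpirun.
export OMPI_MCA_btl='^openib'

# Assumed invocations (process count and binary path are illustrative):
# mpirun -np 12 ./test.x
# mpirun --mca btl '^openib' -np 12 ./test.x

# Alternatively, list the wanted BTLs explicitly instead of excluding one:
# mpirun --mca btl mx,sm,self -np 12 ./test.x
```

Naming the BTLs explicitly also makes it obvious which transport is actually in use, which can help when chasing a problem like the one below.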

> ARMCI configured for 3 cluster nodes. Network protocol is 'MPI-SPAWN'.
> 0:Segmentation Violation error, status=: 11
> 0:ARMCI DASSERT fail. signaltrap.c:SigSegvHandler():299 cond:0
> 4:Segmentation Violation error, status=: 11
> 4:ARMCI DASSERT fail. signaltrap.c:SigSegvHandler():299 cond:0
> 6:Segmentation Violation error, status=: 11
> 6:ARMCI DASSERT fail. signaltrap.c:SigSegvHandler():299 cond:0

It looks like ARMCI itself is seg faulting...? Once that happens, Bad
Things will happen at the MPI layer before the job aborts.

I'm unfamiliar with "ga" or ARMCI, so I don't know exactly what's
happening here...

-- 
Jeff Squyres
Cisco Systems