I hope this is of some help (debugged with TotalView):
Enclosed you will find a graph from TotalView as well as this:
/Created process 2 (7633), named "mpirun"
Thread 2.1 has appeared
Thread 2.2 has appeared
Thread 2.1 received a signal (Segmentation Violation)/
and the stack trace:
/ _mca_pls_xgrid_set_node_name, FP=bffff090
-[PlsXGridClient launchJob:], FP=bffff100
and this (the crashed instruction in bold):
/ 0x00257680: 0x805e0044 lwz rtoc,68(r30)
0x00257684: 0x38000001 li r0,1
*0x00257688: 0x90020010 stw r0,16(rtoc)*
0x0025768c: 0x805e0044 lwz rtoc,68(r30)
0x00257690: 0x38008000 li r0,-32768/
from function /_mca_pls_xgrid_set_node_name/ in /mca_pls_xgrid.so/
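
For what it's worth, that instruction sequence (load a pointer from
offset 68 of another object, then store the constant 1 at offset 16
through it) is the classic shape of a store through a NULL or stale
pointer. A minimal C sketch of the pattern; the struct and field names
are illustrative guesses, not taken from the OMPI source:

    /* Illustrative layout only, chosen so that 32-bit PowerPC code
       generated for set_node_flag() matches the faulting sequence. */
    struct inner {
        int pad[4];
        int flag;                 /* lives at byte offset 16 */
    };

    struct outer {
        char pad[68];
        struct inner *node;       /* loaded by "lwz rtoc,68(r30)" */
    };

    static void set_node_flag(struct outer *o) {
        o->node->flag = 1;        /* "li r0,1" + "stw r0,16(rtoc)";
                                     faults if o->node is NULL/garbage */
    }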
Unfortunately I'm not yet familiar with TotalView, so let me know if
you'd like more output (sorry: I haven't found dbx for Mac OS X ->
that's why TotalView was used).
Date: Wed, 28 Jun 2006 10:35:03 -0400
From: "Terry D. Dontje" <Terry.Dontje_at_[hidden]>
Subject: [OMPI users] Re: OpenMPI 1.1: Signal:10,
info.si_errno:0(Unknown, error: 0), si_code:1(BUS_ADRALN)
Can you set your coredumpsize limit to non-zero, rerun the program,
and then get the stack via dbx?
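
(If changing the shell limit is awkward, core dumps can also be
enabled from inside the program itself. A hedged sketch using the
standard getrlimit/setrlimit calls, assuming a POSIX system; this is
not from the original thread:)

    #include <stdio.h>
    #include <sys/resource.h>

    /* Raise the core-file size soft limit up to the hard limit so a
       crash actually leaves a core file for dbx to load. */
    static void enable_core_dumps(void) {
        struct rlimit rl;
        if (getrlimit(RLIMIT_CORE, &rl) == 0) {
            rl.rlim_cur = rl.rlim_max;
            if (setrlimit(RLIMIT_CORE, &rl) != 0)
                perror("setrlimit(RLIMIT_CORE)");
        }
    }

Call it once at the top of main(), before MPI_Init().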
So, I have a similar case of BUS_ADRALN on SPARC systems with an
older version (June 21st) of the trunk. I've since run using the latest
trunk and the bus error went away. I am now going to try this out with
v1.1 to see if I get the same results. Your stack would help me try to
determine whether this is an Open MPI problem or possibly some type of
platform problem.
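
For context, BUS_ADRALN means the bus error was an address-alignment
fault: SPARC (and some PowerPC configurations) trap on misaligned
accesses that x86 silently tolerates. A tiny C demonstrator of that
failure mode, not OMPI code:

    #include <stdio.h>

    int main(void) {
        char buf[16] = {0};
        /* buf+1 is not suitably aligned for a double. On SPARC this
           store raises SIGBUS with si_code BUS_ADRALN; on x86 it
           usually just works (though it is still undefined behaviour
           in C). */
        double *p = (double *)(buf + 1);
        *p = 1.0;
        printf("%f\n", *p);
        return 0;
    }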
There is another thread with Eric Thibodeau, but I am unsure whether
it concerns the same issue as either of our situations.
> >Message: 3
> >Date: Wed, 28 Jun 2006 14:30:12 +0200
> >From: openmpi-user <openmpi-user_at_[hidden]>
> >Subject: Re: [OMPI users] OpenMPI 1.1: Signal:10
> > info.si_errno:0(Unknown, error: 0), si_code:1(BUS_ADRALN) (Terry D.
> > Dontje)
> >To: users_at_[hidden]
> >Hi Terry,
> >unfortunately I haven't got a stack trace.
> >OS: Mac OS X 10.4.7 Server on the Xgrid server and Mac OS X 10.4.7
> >client on every node (G4 and G5). For testing purposes I've installed
> >OpenMPI 1.1 on a dual-G4 node and on a dual-G5 node, with my Xgrid
> >consisting of only the dual-G4 node or only the dual-G5 node. No
> >matter which configuration I used, I ran into the bus error.