Open MPI User's Mailing List Archives

From: openmpi-user (openmpi-user_at_[hidden])
Date: 2006-06-28 08:30:12


Hi Terry,

Unfortunately, I haven't got a stack trace.

OS: Mac OS X 10.4.7 Server on the Xgrid server and Mac OS X 10.4.7
Client on every node (G4 and G5). For testing purposes I've installed
OpenMPI 1.1 on a dual-G4 node and on a dual-G5 node, with the Xgrid
consisting of only the dual-G4 node or only the dual-G5 node. In either
configuration I ran into the bus error.
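
For reference, a backtrace could presumably be captured along these lines
(just a sketch, not verified here -- it assumes gdb is installed and that
Mac OS X writes core files to /cores once core dumps are enabled; <pid>
stands for whatever core file actually appears):

  % limit coredumpsize unlimited     (csh; "ulimit -c unlimited" in bash)
  % mpirun -d -np 2 ./vhone          (reproduce the bus error)
  % gdb ./vhone /cores/core.<pid>    (open the resulting core file)
  (gdb) bt                           (print the stack trace)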

Yours,
Frank

users-request_at_[hidden] wrote:
> Send users mailing list submissions to
> users_at_[hidden]
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> or, via email, send a message with subject or body 'help' to
> users-request_at_[hidden]
>
> You can reach the person managing the list at
> users-owner_at_[hidden]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of users digest..."
>
>
> Today's Topics:
>
> 1. Re: OpenMPI 1.1: Signal:10 info.si_errno:0(Unknown error: 0)
> si_code:1(BUS_ADRALN) (Terry D. Dontje)
> 2. Re: OpenMPI 1.1: Signal:10 info.si_errno:0(Unknown error: 0)
> si_code:1(BUS_ADRALN) (Frank) (Frank)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 28 Jun 2006 07:26:46 -0400
> From: "Terry D. Dontje" <Terry.Dontje_at_[hidden]>
> Subject: Re: [OMPI users] OpenMPI 1.1: Signal:10
> info.si_errno:0(Unknown error: 0) si_code:1(BUS_ADRALN)
> To: users_at_[hidden]
> Message-ID: <44A26776.2000706_at_[hidden]>
> Content-Type: text/plain; format=flowed; charset=ISO-8859-1
>
> Frank,
>
> Do you possibly have a stack trace of the program that failed? Also,
> what OS and platform are you running on?
>
> --td
>
>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Wed, 28 Jun 2006 12:53:14 +0200
>> From: Frank <openmpi-user_at_[hidden]>
>> Subject: [OMPI users] OpenMPI 1.1: Signal:10 info.si_errno:0(Unknown
>> error: 0) si_code:1(BUS_ADRALN)
>> To: users_at_[hidden]
>> Message-ID: <44A25F9A.3070800_at_[hidden]>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> Hi!
>>
>> I've recently updated to OpenMPI 1.1 on a few nodes and am running into
>> a problem that wasn't there with OpenMPI 1.0.2.
>>
>> Submitting a job to Xgrid with OpenMPI 1.1 yields a bus error that does
>> not occur when the job is not submitted to Xgrid:
>>
>> [g5dual:/Network/CFD/MVH-1.0] motte% mpirun -d -np 2 ./vhone
>> [g5dual.3-net:08794] [0,0,0] setting up session dir with
>> [g5dual.3-net:08794] universe default-universe
>> [g5dual.3-net:08794] user motte
>> [g5dual.3-net:08794] host g5dual.3-net
>> [g5dual.3-net:08794] jobid 0
>> [g5dual.3-net:08794] procid 0
>> [g5dual.3-net:08794] procdir:
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe/0/0
>> [g5dual.3-net:08794] jobdir:
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe/0
>> [g5dual.3-net:08794] unidir:
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe
>> [g5dual.3-net:08794] top: openmpi-sessions-motte_at_g5dual.3-net_0
>> [g5dual.3-net:08794] tmp: /tmp
>> [g5dual.3-net:08794] [0,0,0] contact_file
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe/universe-setup.txt
>> [g5dual.3-net:08794] [0,0,0] wrote setup file
>> Signal:10 info.si_errno:0(Unknown error: 0) si_code:1(BUS_ADRALN)
>> Failing at addr:0x10
>> *** End of error message ***
>> Bus error
>> [g5dual:/Network/CFD/MVH-1.0] motte%
>>
>> When the job is not submitted to Xgrid, OpenMPI 1.1 works just fine:
>>
>> [g5dual:/Network/CFD/MVH-1.0] motte% mpirun -d -np 2 ./vhone
>> [g5dual.3-net:08957] procdir: (null)
>> [g5dual.3-net:08957] jobdir: (null)
>> [g5dual.3-net:08957] unidir:
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe
>> [g5dual.3-net:08957] top: openmpi-sessions-motte_at_g5dual.3-net_0
>> [g5dual.3-net:08957] tmp: /tmp
>> [g5dual.3-net:08957] connect_uni: contact info read
>> [g5dual.3-net:08957] connect_uni: connection not allowed
>> [g5dual.3-net:08957] [0,0,0] setting up session dir with
>> [g5dual.3-net:08957] tmpdir /tmp
>> [g5dual.3-net:08957] universe default-universe-8957
>> [g5dual.3-net:08957] user motte
>> [g5dual.3-net:08957] host g5dual.3-net
>> [g5dual.3-net:08957] jobid 0
>> [g5dual.3-net:08957] procid 0
>> [g5dual.3-net:08957] procdir:
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe-8957/0/0
>> [g5dual.3-net:08957] jobdir:
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe-8957/0
>> [g5dual.3-net:08957] unidir:
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe-8957
>> [g5dual.3-net:08957] top: openmpi-sessions-motte_at_g5dual.3-net_0
>> [g5dual.3-net:08957] tmp: /tmp
>> [g5dual.3-net:08957] [0,0,0] contact_file
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe-8957/universe-setup.txt
>> [g5dual.3-net:08957] [0,0,0] wrote setup file
>> [g5dual.3-net:08957] pls:rsh: local csh: 1, local bash: 0
>> [g5dual.3-net:08957] pls:rsh: assuming same remote shell as local shell
>> [g5dual.3-net:08957] pls:rsh: remote csh: 1, remote bash: 0
>> [g5dual.3-net:08957] pls:rsh: final template argv:
>> [g5dual.3-net:08957] pls:rsh: /usr/bin/ssh <template> orted --debug
>> --bootproxy 1 --name <template> --num_procs 2 --vpid_start 0 --nodename
>> <template> --universe motte_at_g5dual.3-net:default-universe-8957
>> --nsreplica "0.0.0;tcp://192.168.3.2:49281" --gprreplica
>> "0.0.0;tcp://192.168.3.2:49281" --mpi-call-yield 0
>> [g5dual.3-net:08957] pls:rsh: launching on node localhost
>> [g5dual.3-net:08957] pls:rsh: oversubscribed -- setting
>> mpi_yield_when_idle to 1 (1 2)
>> [g5dual.3-net:08957] pls:rsh: localhost is a LOCAL node
>> [g5dual.3-net:08957] pls:rsh: changing to directory /Users/motte
>> [g5dual.3-net:08957] pls:rsh: executing: orted --debug --bootproxy 1
>> --name 0.0.1 --num_procs 2 --vpid_start 0 --nodename localhost
>> --universe motte_at_g5dual.3-net:default-universe-8957 --nsreplica
>> "0.0.0;tcp://192.168.3.2:49281" --gprreplica
>> "0.0.0;tcp://192.168.3.2:49281" --mpi-call-yield 1
>> [g5dual.3-net:08958] [0,0,1] setting up session dir with
>> [g5dual.3-net:08958] universe default-universe-8957
>> [g5dual.3-net:08958] user motte
>> [g5dual.3-net:08958] host localhost
>> [g5dual.3-net:08958] jobid 0
>> [g5dual.3-net:08958] procid 1
>> [g5dual.3-net:08958] procdir:
>> /tmp/openmpi-sessions-motte_at_localhost_0/default-universe-8957/0/1
>> [g5dual.3-net:08958] jobdir:
>> /tmp/openmpi-sessions-motte_at_localhost_0/default-universe-8957/0
>> [g5dual.3-net:08958] unidir:
>> /tmp/openmpi-sessions-motte_at_localhost_0/default-universe-8957
>> [g5dual.3-net:08958] top: openmpi-sessions-motte_at_localhost_0
>> [g5dual.3-net:08958] tmp: /tmp
>> [g5dual.3-net:08962] [0,1,1] setting up session dir with
>> [g5dual.3-net:08962] universe default-universe-8957
>> [g5dual.3-net:08962] user motte
>> [g5dual.3-net:08962] host localhost
>> [g5dual.3-net:08962] jobid 1
>> [g5dual.3-net:08962] procid 1
>> [g5dual.3-net:08962] procdir:
>> /tmp/openmpi-sessions-motte_at_localhost_0/default-universe-8957/1/1
>> [g5dual.3-net:08962] jobdir:
>> /tmp/openmpi-sessions-motte_at_localhost_0/default-universe-8957/1
>> [g5dual.3-net:08962] unidir:
>> /tmp/openmpi-sessions-motte_at_localhost_0/default-universe-8957
>> [g5dual.3-net:08962] top: openmpi-sessions-motte_at_localhost_0
>> [g5dual.3-net:08962] tmp: /tmp
>> [g5dual.3-net:08960] [0,1,0] setting up session dir with
>> [g5dual.3-net:08960] universe default-universe-8957
>> [g5dual.3-net:08960] user motte
>> [g5dual.3-net:08960] host localhost
>> [g5dual.3-net:08960] jobid 1
>> [g5dual.3-net:08960] procid 0
>> [g5dual.3-net:08960] procdir:
>> /tmp/openmpi-sessions-motte_at_localhost_0/default-universe-8957/1/0
>> [g5dual.3-net:08960] jobdir:
>> /tmp/openmpi-sessions-motte_at_localhost_0/default-universe-8957/1
>> [g5dual.3-net:08960] unidir:
>> /tmp/openmpi-sessions-motte_at_localhost_0/default-universe-8957
>> [g5dual.3-net:08960] top: openmpi-sessions-motte_at_localhost_0
>> [g5dual.3-net:08960] tmp: /tmp
>> [g5dual.3-net:08957] spawn: in job_state_callback(jobid = 1, state = 0x4)
>> [g5dual.3-net:08957] Info: Setting up debugger process table for
>> applications
>> MPIR_being_debugged = 0
>> MPIR_debug_gate = 0
>> MPIR_debug_state = 1
>> MPIR_acquired_pre_main = 0
>> MPIR_i_am_starter = 0
>> MPIR_proctable_size = 2
>> MPIR_proctable:
>> (i, host, exe, pid) = (0, localhost, ./vhone, 8960)
>> (i, host, exe, pid) = (1, localhost, ./vhone, 8962)
>> [g5dual.3-net:08960] [0,1,0] ompi_mpi_init completed
>> [g5dual.3-net:08962] [0,1,1] ompi_mpi_init completed
>>
>> (Rest omitted)
>>
>> When submitting the job to Xgrid with OpenMPI 1.0.2, there is no bus
>> error either:
>>
>> [g5dual:/Network/CFD/MVH-1.0] motte% mpirun -d -np 2 ./vhone
>> [g5dual.3-net:00497] [0,0,0] setting up session dir with
>> [g5dual.3-net:00497] universe default-universe
>> [g5dual.3-net:00497] user motte
>> [g5dual.3-net:00497] host g5dual.3-net
>> [g5dual.3-net:00497] jobid 0
>> [g5dual.3-net:00497] procid 0
>> [g5dual.3-net:00497] procdir:
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe/0/0
>> [g5dual.3-net:00497] jobdir:
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe/0
>> [g5dual.3-net:00497] unidir:
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe
>> [g5dual.3-net:00497] top: openmpi-sessions-motte_at_g5dual.3-net_0
>> [g5dual.3-net:00497] tmp: /tmp
>> [g5dual.3-net:00497] [0,0,0] contact_file
>> /tmp/openmpi-sessions-motte_at_g5dual.3-net_0/default-universe/universe-setup.txt
>> [g5dual.3-net:00497] [0,0,0] wrote setup file
>> [g5dual.3-net:00497] spawn: in job_state_callback(jobid = 1, state = 0x1)
>> [g5dual.3-net:00507] [0,1,1] setting up session dir with
>> [g5dual.3-net:00507] universe default-universe
>> [g5dual.3-net:00507] user nobody
>> [g5dual.3-net:00507] host xgrid-node-1
>> [g5dual.3-net:00507] jobid 1
>> [g5dual.3-net:00507] procid 1
>> [g5dual.3-net:00507] procdir:
>> /tmp/openmpi-sessions-nobody_at_xgrid-node-1_0/default-universe/1/1
>> [g5dual.3-net:00507] jobdir:
>> /tmp/openmpi-sessions-nobody_at_xgrid-node-1_0/default-universe/1
>> [g5dual.3-net:00507] unidir:
>> /tmp/openmpi-sessions-nobody_at_xgrid-node-1_0/default-universe
>> [g5dual.3-net:00507] top: openmpi-sessions-nobody_at_xgrid-node-1_0
>> [g5dual.3-net:00507] tmp: /tmp
>> [g5dual.3-net:00509] [0,1,0] setting up session dir with
>> [g5dual.3-net:00509] universe default-universe
>> [g5dual.3-net:00509] user nobody
>> [g5dual.3-net:00509] host xgrid-node-0
>> [g5dual.3-net:00509] jobid 1
>> [g5dual.3-net:00509] procid 0
>> [g5dual.3-net:00509] procdir:
>> /tmp/openmpi-sessions-nobody_at_xgrid-node-0_0/default-universe/1/0
>> [g5dual.3-net:00509] jobdir:
>> /tmp/openmpi-sessions-nobody_at_xgrid-node-0_0/default-universe/1
>> [g5dual.3-net:00509] unidir:
>> /tmp/openmpi-sessions-nobody_at_xgrid-node-0_0/default-universe
>> [g5dual.3-net:00509] top: openmpi-sessions-nobody_at_xgrid-node-0_0
>> [g5dual.3-net:00509] tmp: /tmp
>> [g5dual.3-net:00497] spawn: in job_state_callback(jobid = 1, state = 0x3)
>> [g5dual.3-net:00497] Info: Setting up debugger process table for
>> applications
>> MPIR_being_debugged = 0
>> MPIR_debug_gate = 0
>> MPIR_debug_state = 1
>> MPIR_acquired_pre_main = 0
>> MPIR_i_am_starter = 0
>> MPIR_proctable_size = 2
>> MPIR_proctable:
>> (i, host, exe, pid) = (0, xgrid-node-1, ./vhone, 507)
>> (i, host, exe, pid) = (1, xgrid-node-0, ./vhone, 509)
>> [g5dual.3-net:00497] spawn: in job_state_callback(jobid = 1, state = 0x4)
>> [g5dual.3-net:00507] [0,1,1] ompi_mpi_init completed
>> [g5dual.3-net:00509] [0,1,0] ompi_mpi_init completed
>>
>> (Rest omitted)
>>
>> These are the steps I took to upgrade to OpenMPI 1.1 (see the shell
>> transcript after the list):
>>
>> - sudo make uninstall (within the OpenMPI 1.0.2-directory tree)
>> - cd to OpenMPI 1.1-directory tree
>> - ./configure --prefix=/usr --with-xgrid
>> - sudo make all install
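>>
>> (As a shell transcript this would look roughly as follows -- a sketch only;
>> the directory names openmpi-1.0.2 and openmpi-1.1 are assumptions about how
>> the two source trees are named:)
>>
>> % cd openmpi-1.0.2                        # old source tree (assumed name)
>> % sudo make uninstall                     # remove the 1.0.2 installation
>> % cd ../openmpi-1.1                       # new source tree (assumed name)
>> % ./configure --prefix=/usr --with-xgrid  # same options as listed above
>> % sudo make all install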
>>
>> Maybe the ompi_info output is of some help:
>>
>> [g5dual:/Network/CFD/MVH-1.0] motte% ompi_info
>> Open MPI: 1.1
>> Open MPI SVN revision: r10477
>> Open RTE: 1.1
>> Open RTE SVN revision: r10477
>> OPAL: 1.1
>> OPAL SVN revision: r10477
>> Prefix: /usr
>> Configured architecture: powerpc-apple-darwin8.7.0
>> Configured by: admin
>> Configured on: Wed Jun 28 10:55:12 CEST 2006
>> Configure host: G5-Dual.local
>> Built by: root
>> Built on: Wed Jun 28 11:32:04 CEST 2006
>> Built host: G5-Dual.local
>> C bindings: yes
>> C++ bindings: yes
>> Fortran77 bindings: yes (single underscore)
>> Fortran90 bindings: yes
>> Fortran90 bindings size: small
>> C compiler: gcc
>> C compiler absolute: /usr/bin/gcc
>> C++ compiler: g++
>> C++ compiler absolute: /usr/bin/g++
>> Fortran77 compiler: gfortran
>> Fortran77 compiler abs: /usr/local/bin/gfortran
>> Fortran90 compiler: gfortran
>> Fortran90 compiler abs: /usr/local/bin/gfortran
>> C profiling: yes
>> C++ profiling: yes
>> Fortran77 profiling: yes
>> Fortran90 profiling: yes
>> C++ exceptions: no
>> Thread support: posix (mpi: no, progress: no)
>> Internal debug support: no
>> MPI parameter check: runtime
>> Memory profiling support: no
>> Memory debugging support: no
>> libltdl support: yes
>> MCA memory: darwin (MCA v1.0, API v1.0, Component v1.1)
>> MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1)
>> MCA timer: darwin (MCA v1.0, API v1.0, Component v1.1)
>> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
>> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
>> MCA coll: basic (MCA v1.0, API v1.0, Component v1.1)
>> MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1)
>> MCA coll: self (MCA v1.0, API v1.0, Component v1.1)
>> MCA coll: sm (MCA v1.0, API v1.0, Component v1.1)
>> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1)
>> MCA io: romio (MCA v1.0, API v1.0, Component v1.1)
>> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1)
>> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1)
>> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1)
>> MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1)
>> MCA btl: self (MCA v1.0, API v1.0, Component v1.1)
>> MCA btl: sm (MCA v1.0, API v1.0, Component v1.1)
>> MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
>> MCA topo: unity (MCA v1.0, API v1.0, Component v1.1)
>> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
>> MCA gpr: null (MCA v1.0, API v1.0, Component v1.1)
>> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1)
>> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1)
>> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1)
>> MCA iof: svc (MCA v1.0, API v1.0, Component v1.1)
>> MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1)
>> MCA ns: replica (MCA v1.0, API v1.0, Component v1.1)
>> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
>> MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1)
>> MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1)
>> MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1)
>> MCA ras: xgrid (MCA v1.0, API v1.0, Component v1.1)
>> MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1)
>> MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1)
>> MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1)
>> MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1)
>> MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1)
>> MCA rml: oob (MCA v1.0, API v1.0, Component v1.1)
>> MCA pls: fork (MCA v1.0, API v1.0, Component v1.1)
>> MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1)
>> MCA pls: xgrid (MCA v1.0, API v1.0, Component v1.1)
>> MCA sds: env (MCA v1.0, API v1.0, Component v1.1)
>> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1)
>> MCA sds: seed (MCA v1.0, API v1.0, Component v1.1)
>> MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1)
>>
>> Enclosed you'll find the config.log.
>>
>> Yours,
>> Frank
>>
>>
>> -------------- next part --------------
>> An embedded and charset-unspecified text was scrubbed...
>> Name: config.log
>> Url: http://www.open-mpi.org/MailArchives/users/attachments/20060628/a640acf1/config.pl
>>
>> ------------------------------
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> End of users Digest, Vol 317, Issue 1
>> *************************************
>>
>>
>>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 28 Jun 2006 13:56:32 +0200
> From: Frank <openmpi-user_at_[hidden]>
> Subject: Re: [OMPI users] OpenMPI 1.1: Signal:10
> info.si_errno:0(Unknown error: 0) si_code:1(BUS_ADRALN) (Frank)
> To: users_at_[hidden]
> Message-ID: <44A26E70.8030903_at_[hidden]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi!
>
> The very same error occurred with openmpi-1.1rc2r10468, too.
>
> Yours,
> Frank
>
> users-request_at_[hidden] wrote:
>
>> [Quoted digest (users Digest, Vol 317, Issue 1) omitted; it is identical
>> to the original report already quoted in Message 1 above.]
>
>
>
> ------------------------------
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> End of users Digest, Vol 317, Issue 2
> *************************************
>
>
>