Open MPI Development Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-03-22 14:46:36


Yes, if you could recompile with debugging, that would be great.
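
For reference, a rebuild with debugging symbols could look something like
this (reusing the configure options from your original build; --enable-debug
is what turns on -g and Open MPI's internal debugging checks):

./configure --enable-debug --with-devel-headers --without-threads
make clean all install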

Which launcher are you trying to use?

One other observation: the failing address 0x2e342e33 is four ASCII
characters ('.', '4', '.', '3'), which looks like a function pointer that
was overwritten with part of a version string. Together with the "unable to
open pml teg" warning in your ompi_info output (the teg PML no longer exists
in 1.2), it may be worth checking whether /usr/local/lib/openmpi contains
stale components left over from an older install.
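
With the debug build in place, running mpirun itself under gdb should show
exactly where orte_init_stage1 is dying ("test" here is just the example
executable from your earlier mail):

gdb --args mpirun -np 2 test
(gdb) run
(gdb) bt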

On Mar 22, 2007, at 2:35 PM, Greg Watson wrote:

> gdb says this:
>
> #0 0x2e342e33 in ?? ()
> #1 0xb7fe1d31 in orte_pls_base_select () from /usr/local/lib/libopen-rte.so.0
> #2 0xb7fc50cb in orte_init_stage1 () from /usr/local/lib/libopen-rte.so.0
> #3 0xb7fc84be in orte_system_init () from /usr/local/lib/libopen-rte.so.0
> #4 0xb7fc4cee in orte_init () from /usr/local/lib/libopen-rte.so.0
> #5 0x08049ecb in orterun (argc=4, argv=0xbffff9f4) at orterun.c:369
> #6 0x08049d7a in main (argc=4, argv=0xbffff9f4) at main.c:13
> (gdb) The program is running. Exit anyway? (y or n) y
>
> I can recompile with debugging if that would be useful. Let me know
> if there's anything else I can do.
>
> Here's ompi_info in case it helps:
>
> Open MPI: 1.2
> Open MPI SVN revision: r14027
> Open RTE: 1.2
> Open RTE SVN revision: r14027
> OPAL: 1.2
> OPAL SVN revision: r14027
> Prefix: /usr/local
> Configured architecture: i686-pc-linux-gnu
> Configured on: Thu Mar 22 13:39:30 EDT 2007
> Built on: Thu Mar 22 13:55:38 EDT 2007
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: yes (all)
> Fortran90 bindings: no
> Fortran90 bindings size: na
> C compiler: gcc
> C compiler absolute: /usr/bin/gcc
> C++ compiler: g++
> C++ compiler absolute: /usr/bin/g++
> Fortran77 compiler: g77
> Fortran77 compiler abs: /usr/bin/g77
> Fortran90 compiler: none
> Fortran90 compiler abs: none
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: yes
> Fortran90 profiling: no
> C++ exceptions: no
> Thread support: no
> Internal debug support: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> Heterogeneous support: yes
> mpirun default --prefix: no
> mca: base: component_find: unable to open pml teg: file not found (ignored)
> MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2)
> MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2)
> MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2)
> MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2)
> MCA timer: linux (MCA v1.0, API v1.0, Component v1.2)
> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
> MCA coll: basic (MCA v1.0, API v1.0, Component v1.2)
> MCA coll: self (MCA v1.0, API v1.0, Component v1.2)
> MCA coll: sm (MCA v1.0, API v1.0, Component v1.2)
> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2)
> MCA io: romio (MCA v1.0, API v1.0, Component v1.2)
> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2)
> MCA pml: cm (MCA v1.0, API v1.0, Component v1.2)
> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2)
> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2)
> MCA rcache: rb (MCA v1.0, API v1.0, Component v1.2)
> MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2)
> MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2)
> MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2)
> MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
> MCA topo: unity (MCA v1.0, API v1.0, Component v1.2)
> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2)
> MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2)
> MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2)
> MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2)
> MCA gpr: null (MCA v1.0, API v1.0, Component v1.2)
> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2)
> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2)
> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2)
> MCA iof: svc (MCA v1.0, API v1.0, Component v1.2)
> MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2)
> MCA ns: replica (MCA v1.0, API v2.0, Component v1.2)
> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2)
> MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2)
> MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.0.2)
> MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2)
> MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2)
> MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2)
> MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2)
> MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2)
> MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2)
> MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2)
> MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2)
> MCA rml: oob (MCA v1.0, API v1.0, Component v1.2)
> MCA pls: daemon (MCA v1.0, API v1.0, Component v1.0.2)
> MCA pls: fork (MCA v1.0, API v1.0, Component v1.0.2)
> MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2)
> MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2)
> MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2)
> MCA pls: slurm (MCA v1.0, API v1.3, Component v1.2)
> MCA sds: env (MCA v1.0, API v1.0, Component v1.2)
> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2)
> MCA sds: seed (MCA v1.0, API v1.0, Component v1.2)
> MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2)
> MCA sds: slurm (MCA v1.0, API v1.0, Component v1.2)
>
> Greg
>
> On Mar 22, 2007, at 12:29 PM, Jeff Squyres wrote:
>
>> No, not a known problem -- my cluster is RHEL4U4 -- I use it for many
>> thousands of runs of the OMPI v1.2 branch every day...
>>
>> Can you see where it's dying in orte_init_stage1?
>>
>>
>> On Mar 22, 2007, at 2:17 PM, Greg Watson wrote:
>>
>>> Is this a known problem? Building ompi 1.2 on RHEL4:
>>>
>>> ./configure --with-devel-headers --without-threads
>>>
>>> (actually tried without '--without-threads' too, but no change)
>>>
>>> $ mpirun -np 2 test
>>> [beth:06029] *** Process received signal ***
>>> [beth:06029] Signal: Segmentation fault (11)
>>> [beth:06029] Signal code: Address not mapped (1)
>>> [beth:06029] Failing at address: 0x2e342e33
>>> [beth:06029] [ 0] /lib/tls/libc.so.6 [0x21b890]
>>> [beth:06029] [ 1] /usr/local/lib/libopen-rte.so.0(orte_init_stage1+0x293) [0xb7fc50cb]
>>> [beth:06029] [ 2] /usr/local/lib/libopen-rte.so.0(orte_system_init+0x1e) [0xb7fc84be]
>>> [beth:06029] [ 3] /usr/local/lib/libopen-rte.so.0(orte_init+0x6a) [0xb7fc4cee]
>>> [beth:06029] [ 4] mpirun(orterun+0x14b) [0x8049ecb]
>>> [beth:06029] [ 5] mpirun(main+0x2a) [0x8049d7a]
>>> [beth:06029] [ 6] /lib/tls/libc.so.6(__libc_start_main+0xd3) [0x208de3]
>>> [beth:06029] [ 7] mpirun [0x8049cc9]
>>> [beth:06029] *** End of error message ***
>>> Segmentation fault
>>>
>>> Thanks,
>>>
>>> Greg
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>

-- 
Jeff Squyres
Cisco Systems