
Open MPI Development Mailing List Archives


From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-03-22 15:28:57


Whew! You had me worried there for a minute... :-)

On Mar 22, 2007, at 3:15 PM, Greg Watson wrote:

> Scratch that. The problem was an installation over an old copy of
> ompi. Obviously picking up some old stuff.
>
> Sorry for the disturbance. Back to the bat cave...
>
> Greg
>
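A stale prefix like the one Greg describes above can leave old MCA plugins under <prefix>/lib/openmpi that the new libraries still dlopen at startup. A minimal cleanup sketch before reinstalling, assuming the /usr/local prefix shown in the ompi_info output further down (adjust the paths to your own layout):

  rm -rf /usr/local/lib/openmpi            # old MCA components
  rm -f /usr/local/lib/libmpi.so* /usr/local/lib/libopen-rte.so* /usr/local/lib/libopen-pal.so*
  make install                             # reinstall the freshly built tree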
> On Mar 22, 2007, at 12:46 PM, Jeff Squyres wrote:
>
>> Yes, if you could recompile with debugging, that would be great.
>>
>> What launcher are you trying to use?
>>
>>
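For the "recompile with debugging" suggestion above, one possible way to get debugging symbols and internal checks into the build, assuming the same source tree and default /usr/local prefix as the configure line quoted below (--enable-debug is Open MPI's debug-build configure switch):

  ./configure --with-devel-headers --enable-debug
  make all install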
>> On Mar 22, 2007, at 2:35 PM, Greg Watson wrote:
>>
>>> gdb says this:
>>>
>>> #0 0x2e342e33 in ?? ()
>>> #1 0xb7fe1d31 in orte_pls_base_select () from /usr/local/lib/libopen-rte.so.0
>>> #2 0xb7fc50cb in orte_init_stage1 () from /usr/local/lib/libopen-rte.so.0
>>> #3 0xb7fc84be in orte_system_init () from /usr/local/lib/libopen-rte.so.0
>>> #4 0xb7fc4cee in orte_init () from /usr/local/lib/libopen-rte.so.0
>>> #5 0x08049ecb in orterun (argc=4, argv=0xbffff9f4) at orterun.c:369
>>> #6 0x08049d7a in main (argc=4, argv=0xbffff9f4) at main.c:13
>>> (gdb) The program is running. Exit anyway? (y or n) y
>>>
>>> I can recompile with debugging if that would be useful. Let me know
>>> if there's anything else I can do.
>>>
>>> Here's ompi_info in case it helps:
>>>
>>> Open MPI: 1.2
>>> Open MPI SVN revision: r14027
>>> Open RTE: 1.2
>>> Open RTE SVN revision: r14027
>>> OPAL: 1.2
>>> OPAL SVN revision: r14027
>>> Prefix: /usr/local
>>> Configured architecture: i686-pc-linux-gnu
>>> Configured on: Thu Mar 22 13:39:30 EDT 2007
>>> Built on: Thu Mar 22 13:55:38 EDT 2007
>>> C bindings: yes
>>> C++ bindings: yes
>>> Fortran77 bindings: yes (all)
>>> Fortran90 bindings: no
>>> Fortran90 bindings size: na
>>> C compiler: gcc
>>> C compiler absolute: /usr/bin/gcc
>>> C++ compiler: g++
>>> C++ compiler absolute: /usr/bin/g++
>>> Fortran77 compiler: g77
>>> Fortran77 compiler abs: /usr/bin/g77
>>> Fortran90 compiler: none
>>> Fortran90 compiler abs: none
>>> C profiling: yes
>>> C++ profiling: yes
>>> Fortran77 profiling: yes
>>> Fortran90 profiling: no
>>> C++ exceptions: no
>>> Thread support: no
>>> Internal debug support: no
>>> MPI parameter check: runtime
>>> Memory profiling support: no
>>> Memory debugging support: no
>>> libltdl support: yes
>>> Heterogeneous support: yes
>>> mpirun default --prefix: no
>>> mca: base: component_find: unable to open pml teg: file not found (ignored)
>>> MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2)
>>> MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2)
>>> MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2)
>>> MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2)
>>> MCA timer: linux (MCA v1.0, API v1.0, Component v1.2)
>>> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
>>> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
>>> MCA coll: basic (MCA v1.0, API v1.0, Component v1.2)
>>> MCA coll: self (MCA v1.0, API v1.0, Component v1.2)
>>> MCA coll: sm (MCA v1.0, API v1.0, Component v1.2)
>>> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2)
>>> MCA io: romio (MCA v1.0, API v1.0, Component v1.2)
>>> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2)
>>> MCA pml: cm (MCA v1.0, API v1.0, Component v1.2)
>>> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2)
>>> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2)
>>> MCA rcache: rb (MCA v1.0, API v1.0, Component v1.2)
>>> MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2)
>>> MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2)
>>> MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2)
>>> MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
>>> MCA topo: unity (MCA v1.0, API v1.0, Component v1.2)
>>> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2)
>>> MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2)
>>> MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2)
>>> MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2)
>>> MCA gpr: null (MCA v1.0, API v1.0, Component v1.2)
>>> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2)
>>> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2)
>>> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2)
>>> MCA iof: svc (MCA v1.0, API v1.0, Component v1.2)
>>> MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2)
>>> MCA ns: replica (MCA v1.0, API v2.0, Component v1.2)
>>> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
>>> MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2)
>>> MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2)
>>> MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.0.2)
>>> MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2)
>>> MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2)
>>> MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2)
>>> MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2)
>>> MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2)
>>> MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2)
>>> MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2)
>>> MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2)
>>> MCA rml: oob (MCA v1.0, API v1.0, Component v1.2)
>>> MCA pls: daemon (MCA v1.0, API v1.0, Component v1.0.2)
>>> MCA pls: fork (MCA v1.0, API v1.0, Component v1.0.2)
>>> MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2)
>>> MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2)
>>> MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2)
>>> MCA pls: slurm (MCA v1.0, API v1.3, Component v1.2)
>>> MCA sds: env (MCA v1.0, API v1.0, Component v1.2)
>>> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2)
>>> MCA sds: seed (MCA v1.0, API v1.0, Component v1.2)
>>> MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2)
>>> MCA sds: slurm (MCA v1.0, API v1.0, Component v1.2)
>>>
>>> Greg
>>>
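Since the crash is inside orte_pls_base_select(), it is worth noting that the pls list above mixes Component v1.0.2 entries (daemon, fork) with v1.2 ones, consistent with the stale-install diagnosis at the top of the thread. A quick way to inspect just that framework is ompi_info's per-framework parameter listing (a sketch; exact output format varies with the 1.2-era ompi_info):

  ompi_info --param pls all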
>>> On Mar 22, 2007, at 12:29 PM, Jeff Squyres wrote:
>>>
>>>> No, not a known problem -- my cluster is RHEL4U4 -- I use it for many
>>>> thousands of runs of the OMPI v1.2 branch every day...
>>>>
>>>> Can you see where it's dying in orte_init_stage1?
>>>>
>>>>
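One way to see where orte_init_stage1() is dying is to run mpirun itself under gdb rather than attaching afterwards; a sketch, assuming the same "mpirun -np 2 test" invocation from the original report:

  gdb --args mpirun -np 2 test
  (gdb) run
  ... wait for the SIGSEGV ...
  (gdb) bt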
>>>> On Mar 22, 2007, at 2:17 PM, Greg Watson wrote:
>>>>
>>>>> Is this a known problem? Building ompi 1.2 on RHEL4:
>>>>>
>>>>> ./configure --with-devel-headers --without-threads
>>>>>
>>>>> (actually tried without '--without-threads' too, but no change)
>>>>>
>>>>> $ mpirun -np 2 test
>>>>> [beth:06029] *** Process received signal ***
>>>>> [beth:06029] Signal: Segmentation fault (11)
>>>>> [beth:06029] Signal code: Address not mapped (1)
>>>>> [beth:06029] Failing at address: 0x2e342e33
>>>>> [beth:06029] [ 0] /lib/tls/libc.so.6 [0x21b890]
>>>>> [beth:06029] [ 1] /usr/local/lib/libopen-rte.so.0(orte_init_stage1+0x293) [0xb7fc50cb]
>>>>> [beth:06029] [ 2] /usr/local/lib/libopen-rte.so.0(orte_system_init+0x1e) [0xb7fc84be]
>>>>> [beth:06029] [ 3] /usr/local/lib/libopen-rte.so.0(orte_init+0x6a) [0xb7fc4cee]
>>>>> [beth:06029] [ 4] mpirun(orterun+0x14b) [0x8049ecb]
>>>>> [beth:06029] [ 5] mpirun(main+0x2a) [0x8049d7a]
>>>>> [beth:06029] [ 6] /lib/tls/libc.so.6(__libc_start_main+0xd3) [0x208de3]
>>>>> [beth:06029] [ 7] mpirun [0x8049cc9]
>>>>> [beth:06029] *** End of error message ***
>>>>> Segmentation fault
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Greg
>>>>
>>>>
>>>> --
>>>> Jeff Squyres
>>>> Cisco Systems
>>>>
>>>
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>

-- 
Jeff Squyres
Cisco Systems