Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] ompi_info hangs
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-11-11 10:29:03


Hmm. I'm unable to replicate this error. :-(

Is there any chance that you have some stale OMPI libraries (or OMPI
libraries from any other OMPI version) that are accidentally being
found by ompi_info?
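
A rough way to check, assuming a Linux system with `ldd` available. The `ompi_libs` helper name is made up for illustration and is not part of Open MPI. It shows which `ompi_info` is first on the PATH and which Open MPI shared libraries the runtime linker resolves for it:

```shell
# Sketch only: keep just the Open MPI lines from `ldd` output.
# `ompi_libs` is a hypothetical helper, not an Open MPI tool.
ompi_libs() {
    grep -E 'lib(mpi|open-rte|open-pal)'
}

# Only run the check if an ompi_info is actually on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    # Which ompi_info binary is being picked up?
    command -v ompi_info
    # Which Open MPI libraries does it resolve to at run time?
    ldd "$(command -v ompi_info)" | ompi_libs || echo "no Open MPI libraries matched"
    # Extra library search paths that could pull in another install.
    echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}"
fi
```

If the library paths printed do not sit under the same prefix as the `ompi_info` binary, a stale or mismatched install is probably being picked up.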

On Nov 10, 2008, at 10:18 PM, Robert Kubrick wrote:

> I rebuilt without the memory manager, now ompi_info crashes with
> this output:
>
> ./configure --prefix=/usr/local/openmpi --disable-mpi-f90 --disable-mpi-f77 --without-memory-manager
>
> localhost:~/openmpi> ompi_info
> Open MPI: 1.2.8
> Open MPI SVN revision: r19718
> Open RTE: 1.2.8
> Open RTE SVN revision: r19718
> OPAL: 1.2.8
> OPAL SVN revision: r19718
> Prefix: /usr/local/openmpi
> Configured architecture: x86_64-unknown-linux-gnu
> Configured by: root
> Configured on: Tue Nov 11 04:08:47 CET 2008
> Configure host: localhost
> Built by: root
> Built on: Tue Nov 11 04:13:01 CET 2008
> Built host: localhost
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: no
> Fortran90 bindings: no
> Fortran90 bindings size: na
> C compiler: gcc
> C compiler absolute: /usr/bin/gcc
> C++ compiler: g++
> C++ compiler absolute: /usr/bin/g++
> Fortran77 compiler: gfortran
> Fortran77 compiler abs: /usr/bin/gfortran
> Fortran90 compiler: none
> Fortran90 compiler abs: none
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: no
> Fortran90 profiling: no
> C++ exceptions: no
> Thread support: posix (mpi: no, progress: no)
> Internal debug support: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> Heterogeneous support: yes
> mpirun default --prefix: no
> *** glibc detected *** ompi_info: double free or corruption (fasttop): 0x00000000006279e0 ***
> ======= Backtrace: =========
> /lib64/libc.so.6[0x2ae688b0621d]
> /lib64/libc.so.6(cfree+0x76)[0x2ae688b07f76]
> /usr/lib64/libstdc++.so.6(_ZNSs6assignERKSs+0x9c)[0x2ae6881b44bc]
> ompi_info(_ZN9ompi_info15open_componentsEv+0x100)[0x405670]
> ompi_info(main+0x11e7)[0x40b837]
> /lib64/libc.so.6(__libc_start_main+0xf4)[0x2ae688ab5b54]
> ompi_info(__gxx_personality_v0+0x121)[0x405249]
> ======= Memory map: ========
> 00400000-0041f000 r-xp 00000000 08:01 68989625 /usr/local/openmpi/bin/ompi_info
> 0061e000-0061f000 r--p 0001e000 08:01 68989625 /usr/local/openmpi/bin/ompi_info
> 0061f000-00620000 rw-p 0001f000 08:01 68989625 /usr/local/openmpi/bin/ompi_info
> 00620000-00642000 rw-p 00620000 00:00 0 [heap]
> 2ae687174000-2ae687190000 r-xp 00000000 08:01 100681559 /lib64/ld-2.6.1.so
> 2ae687190000-2ae687192000 rw-p 2ae687190000 00:00 0
> 2ae68738f000-2ae687391000 rw-p 0001b000 08:01 100681559 /lib64/ld-2.6.1.so
> 2ae687391000-2ae687411000 r-xp 00000000 08:01 403422302 /usr/local/openmpi/lib/libmpi.so.0.0.0
> 2ae687411000-2ae687611000 ---p 00080000 08:01 403422302 /usr/local/openmpi/lib/libmpi.so.0.0.0
> 2ae687611000-2ae687612000 r--p 00080000 08:01 403422302 /usr/local/openmpi/lib/libmpi.so.0.0.0
> 2ae687612000-2ae68761b000 rw-p 00081000 08:01 403422302 /usr/local/openmpi/lib/libmpi.so.0.0.0
> 2ae68761b000-2ae687622000 rw-p 2ae68761b000 00:00 0
> 2ae687622000-2ae68767a000 r-xp 00000000 08:01 403422294 /usr/local/openmpi/lib/libopen-rte.so.0.0.0
> 2ae68767a000-2ae68787a000 ---p 00058000 08:01 403422294 /usr/local/openmpi/lib/libopen-rte.so.0.0.0
> 2ae68787a000-2ae68787b000 r--p 00058000 08:01 403422294 /usr/local/openmpi/lib/libopen-rte.so.0.0.0
> 2ae68787b000-2ae68787d000 rw-p 00059000 08:01 403422294 /usr/local/openmpi/lib/libopen-rte.so.0.0.0
> 2ae68787d000-2ae68787e000 rw-p 2ae68787d000 00:00 0
> 2ae68787e000-2ae6878b1000 r-xp 00000000 08:01 403422290 /usr/local/openmpi/lib/libopen-pal.so.0.0.0
> 2ae6878b1000-2ae687ab0000 ---p 00033000 08:01 403422290 /usr/local/openmpi/lib/libopen-pal.so.0.0.0
> 2ae687ab0000-2ae687ab1000 r--p 00032000 08:01 403422290 /usr/local/openmpi/lib/libopen-pal.so.0.0.0
> 2ae687ab1000-2ae687ab3000 rw-p 00033000 08:01 403422290 /usr/local/openmpi/lib/libopen-pal.so.0.0.0
> 2ae687ab3000-2ae687ad5000 rw-p 2ae687ab3000 00:00 0
> 2ae687af3000-2ae687af5000 r-xp 00000000 08:01 100681700 /lib64/libdl-2.6.1.so
> 2ae687af5000-2ae687cf5000 ---p 00002000 08:01 100681700 /lib64/libdl-2.6.1.so
> 2ae687cf5000-2ae687cf7000 rw-p 00002000 08:01 100681700 /lib64/libdl-2.6.1.so
> 2ae687cf7000-2ae687cf8000 rw-p 2ae687cf7000 00:00 0
> 2ae687cf8000-2ae687d0c000 r-xp 00000000 08:01 100681705 /lib64/libnsl-2.6.1.so
> 2ae687d0c000-2ae687f0b000 ---p 00014000 08:01 100681705 /lib64/libnsl-2.6.1.so
> 2ae687f0b000-2ae687f0d000 rw-p 00013000 08:01 100681705 /lib64/libnsl-2.6.1.so
> 2ae687f0d000-2ae687f0f000 rw-p 2ae687f0d000 00:00 0
> 2ae687f0f000-2ae687f11000 r-xp 00000000 08:01 100681728 /lib64/libutil-2.6.1.so
> 2ae687f11000-2ae688110000 ---p 00002000 08:01 100681728 /lib64/libutil-2.6.1.so
> 2ae688110000-2ae688112000 rw-p 00001000 08:01 100681728 /lib64/libutil-2.6.1.so
> 2ae688112000-2ae6881fe000 r-xp 00000000 08:01 67350662 /usr/lib64/libstdc++.so.6.0.9
> 2ae6881fe000-2ae6883fe000 ---p 000ec000 08:01 67350662 /usr/lib64/libstdc++.so.6.0.9
> 2ae6883fe000-2ae688404000 r--p 000ec000 08:01 67350662 /usr/lib64/libstdc++.so.6.0.9
> 2ae688404000-2ae688407000 rw-p 000f2000 08:01 67350662 /usr/lib64/libstdc++.so.6.0.9
> 2ae688407000-2ae68841b000 rw-p 2ae688407000 00:00 0
> 2ae68841b000-2ae68846d000 r-xp 00000000 08:01 100681702 /lib64/libm-2.6.1.so
> 2ae68846d000-2ae68866c000 ---p 00052000 08:01 100681702 /lib64/libm-2.6.1.so
> 2ae68866c000-2ae68866e000 rw-p 00051000 08:01 100681702 /lib64/libm-2.6.1.so
> 2ae68866e000-2ae68867b000 r-xp 00000000 08:01 100845329 /lib64/libgcc_s.so.1
> 2ae68867b000-2ae68887a000 ---p 0000d000 08:01 100845329 /lib64/libgcc_s.so.1
> 2ae68887a000-2ae68887c000 rw-p 0000c000 08:01 100845329 /lib64/libgcc_s.so.1
> 2ae68887c000-2ae688891000 r-xp 00000000 08:01 100681720 /lib64/libpthread-2.6.1.so
> 2ae688891000-2ae688a91000 ---p 00015000 08:01 100681720 /lib64/libpthread-2.6.1.so
> 2ae688a91000-2ae688a93000 rw-p 00015000 08:01 100681720 /lib64/libpthread-2.6.1.so
> 2ae688a93000-2ae688a98000 rw-p 2ae688a93000 00:00 0
> 2ae688a98000-2ae688bd4000 r-xp 00000000 08:01 100681566 /lib64/libc-2.6.1.so
> 2ae688bd4000-2ae688dd4000 ---p 0013c000 08:01 100681566 /lib64/libc-2.6.1.so
> 2ae688dd4000-2ae688dd7000 r--p 0013c000 08:01 100681566 /lib64/libc-2.6.1.so
> 2ae688dd7000-2ae688dd9000 rw-p 0013f000 08:01 100681566 /lib64/libc-2.6.1.so
> 2ae688dd9000-2ae688de0000 rw-p 2ae688dd9000 00:00 0
> 2ae68c000000-2ae68c021000 rw-p 2ae68c000000 00:00 0
> 2ae68c021000-2ae690000000 ---p 2ae68c021000 00:00 0
> 7fff23921000-7fff23936000 rw-p 7fff23921000 00:00 0 [stack]
> ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vdso]
> [localhost] *** Process received signal ***
> [localhost] Signal: Aborted (6)
> [localhost] Signal code: (-6)
> [localhost] [ 0] /lib64/libpthread.so.0 [0x2ae688889fb0]
> [localhost] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x2ae688ac8b45]
> [localhost] [ 2] /lib64/libc.so.6(abort+0x110) [0x2ae688aca0e0]
> [localhost] [ 3] /lib64/libc.so.6 [0x2ae688b00fbb]
> [localhost] [ 4] /lib64/libc.so.6 [0x2ae688b0621d]
> [localhost] [ 5] /lib64/libc.so.6(cfree+0x76) [0x2ae688b07f76]
> [localhost] [ 6] /usr/lib64/libstdc++.so.6(_ZNSs6assignERKSs+0x9c) [0x2ae6881b44bc]
> [localhost] [ 7] ompi_info(_ZN9ompi_info15open_componentsEv+0x100) [0x405670]
> [localhost] [ 8] ompi_info(main+0x11e7) [0x40b837]
> [localhost] [ 9] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2ae688ab5b54]
> [localhost] [10] ompi_info(__gxx_personality_v0+0x121) [0x405249]
> [localhost] *** End of error message ***
> Aborted
>
> localhost:~/archives/openmpi-1.2.8> g++ -v
> Using built-in specs.
> Target: x86_64-suse-linux
> Configured with: ../configure --enable-threads=posix --prefix=/usr --with-local-prefix=/usr/local --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.2.1 --enable-ssp --disable-libssp --disable-libgcj --with-slibdir=/lib64 --with-system-zlib --enable-shared --enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch --program-suffix=-4.2 --enable-version-specific-runtime-libs --without-system-libunwind --with-cpu=generic --host=x86_64-suse-linux
> Thread model: posix
> gcc version 4.2.1 (SUSE Linux)
>
>
> On Nov 10, 2008, at 1:44 PM, Jeff Squyres wrote:
>
>> If you're not using OpenFabrics-based networks, try configuring
>> Open MPI --without-memory-manager and see if that fixes your
>> problems.
>>
>>
>> On Nov 8, 2008, at 5:31 PM, Robert Kubrick wrote:
>>
>>> George, I get a warning when running under the debugger: 'Lowest
>>> section in system-supplied DSO at 0xffffe000 is .hash at ffffe0b4'.
>>> The program hangs in _int_malloc():
>>>
>>> (gdb) run
>>> Starting program: /opt/openmpi-1.2.7/bin/ompi_info
>>> warning: Lowest section in system-supplied DSO at 0xffffe000 is .hash at ffffe0b4
>>> [Thread debugging using libthread_db enabled]
>>> [New Thread 0xf7b7d6d0 (LWP 16621)]
>>> 1.2.7
>>>
>>> Program received signal SIGINT, Interrupt.
>>> [Switching to Thread 0xf7b7d6d0 (LWP 16621)]
>>> 0xf7e5267e in _int_malloc () from /opt/openmpi/lib/libopen-pal.so.0
>>> (gdb) where
>>> #0 0xf7e5267e in _int_malloc () from /opt/openmpi/lib/libopen-pal.so.0
>>> #1 0xf7e544e1 in malloc () from /opt/openmpi/lib/libopen-pal.so.0
>>> #2 0xf7db46c7 in operator new () from /usr/lib/libstdc++.so.6
>>> #3 0xf7d8e121 in std::string::_Rep::_S_create () from /usr/lib/libstdc++.so.6
>>> #4 0xf7d8ee18 in std::string::_Rep::_M_clone () from /usr/lib/libstdc++.so.6
>>> #5 0xf7d8fac8 in std::string::reserve () from /usr/lib/libstdc++.so.6
>>> #6 0xf7d8ff6a in std::string::append () from /usr/lib/libstdc++.so.6
>>> #7 0x08054f30 in ompi_info::out ()
>>> #8 0x08062a33 in ompi_info::show_ompi_version ()
>>> #9 0x080533a0 in main ()
>>>
>>> On Nov 8, 2008, at 12:33 PM, George Bosilca wrote:
>>>
>>>> I think we had a similar problem in the past. It has something to
>>>> do with the atomics on this architecture.
>>>>
>>>> I don't have access to such an architecture. Can you provide us
>>>> with a stack trace when this happens?
>>>>
>>>> Thanks,
>>>> george.
>>>>
>>>> On Nov 8, 2008, at 12:14 PM, Robert Kubrick wrote:
>>>>
>>>>> I am having problems building OMPI 1.2.7 on an Intel Xeon
>>>>> quad-core 64-bit server. The compilation completes, but ompi_info
>>>>> hangs after printing the OMPI version:
>>>>>
>>>>> # ompi_info
>>>>> 1.2.7
>>>>>
>>>>> I tried running a few MPI applications on this same install and
>>>>> they work fine. What can cause ompi_info to hang?
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>

-- 
Jeff Squyres
Cisco Systems