Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] ompi_info hangs
From: Robert Kubrick (robertkubrick_at_[hidden])
Date: 2008-11-10 22:18:58


I rebuilt without the memory manager, and now ompi_info crashes with this
output:

./configure --prefix=/usr/local/openmpi --disable-mpi-f90 --disable-mpi-f77 --without-memory-manager

localhost:~/openmpi> ompi_info
                 Open MPI: 1.2.8
    Open MPI SVN revision: r19718
                 Open RTE: 1.2.8
    Open RTE SVN revision: r19718
                     OPAL: 1.2.8
        OPAL SVN revision: r19718
                   Prefix: /usr/local/openmpi
  Configured architecture: x86_64-unknown-linux-gnu
            Configured by: root
            Configured on: Tue Nov 11 04:08:47 CET 2008
           Configure host: localhost
                 Built by: root
                 Built on: Tue Nov 11 04:13:01 CET 2008
               Built host: localhost
               C bindings: yes
             C++ bindings: yes
       Fortran77 bindings: no
       Fortran90 bindings: no
  Fortran90 bindings size: na
               C compiler: gcc
      C compiler absolute: /usr/bin/gcc
             C++ compiler: g++
    C++ compiler absolute: /usr/bin/g++
       Fortran77 compiler: gfortran
   Fortran77 compiler abs: /usr/bin/gfortran
       Fortran90 compiler: none
   Fortran90 compiler abs: none
              C profiling: yes
            C++ profiling: yes
      Fortran77 profiling: no
      Fortran90 profiling: no
           C++ exceptions: no
           Thread support: posix (mpi: no, progress: no)
   Internal debug support: no
      MPI parameter check: runtime
 Memory profiling support: no
 Memory debugging support: no
          libltdl support: yes
    Heterogeneous support: yes
  mpirun default --prefix: no
*** glibc detected *** ompi_info: double free or corruption (fasttop): 0x00000000006279e0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x2ae688b0621d]
/lib64/libc.so.6(cfree+0x76)[0x2ae688b07f76]
/usr/lib64/libstdc++.so.6(_ZNSs6assignERKSs+0x9c)[0x2ae6881b44bc]
ompi_info(_ZN9ompi_info15open_componentsEv+0x100)[0x405670]
ompi_info(main+0x11e7)[0x40b837]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x2ae688ab5b54]
ompi_info(__gxx_personality_v0+0x121)[0x405249]
======= Memory map: ========
00400000-0041f000 r-xp 00000000 08:01 68989625 /usr/local/openmpi/bin/ompi_info
0061e000-0061f000 r--p 0001e000 08:01 68989625 /usr/local/openmpi/bin/ompi_info
0061f000-00620000 rw-p 0001f000 08:01 68989625 /usr/local/openmpi/bin/ompi_info
00620000-00642000 rw-p 00620000 00:00 0 [heap]
2ae687174000-2ae687190000 r-xp 00000000 08:01 100681559 /lib64/ld-2.6.1.so
2ae687190000-2ae687192000 rw-p 2ae687190000 00:00 0
2ae68738f000-2ae687391000 rw-p 0001b000 08:01 100681559 /lib64/ld-2.6.1.so
2ae687391000-2ae687411000 r-xp 00000000 08:01 403422302 /usr/local/openmpi/lib/libmpi.so.0.0.0
2ae687411000-2ae687611000 ---p 00080000 08:01 403422302 /usr/local/openmpi/lib/libmpi.so.0.0.0
2ae687611000-2ae687612000 r--p 00080000 08:01 403422302 /usr/local/openmpi/lib/libmpi.so.0.0.0
2ae687612000-2ae68761b000 rw-p 00081000 08:01 403422302 /usr/local/openmpi/lib/libmpi.so.0.0.0
2ae68761b000-2ae687622000 rw-p 2ae68761b000 00:00 0
2ae687622000-2ae68767a000 r-xp 00000000 08:01 403422294 /usr/local/openmpi/lib/libopen-rte.so.0.0.0
2ae68767a000-2ae68787a000 ---p 00058000 08:01 403422294 /usr/local/openmpi/lib/libopen-rte.so.0.0.0
2ae68787a000-2ae68787b000 r--p 00058000 08:01 403422294 /usr/local/openmpi/lib/libopen-rte.so.0.0.0
2ae68787b000-2ae68787d000 rw-p 00059000 08:01 403422294 /usr/local/openmpi/lib/libopen-rte.so.0.0.0
2ae68787d000-2ae68787e000 rw-p 2ae68787d000 00:00 0
2ae68787e000-2ae6878b1000 r-xp 00000000 08:01 403422290 /usr/local/openmpi/lib/libopen-pal.so.0.0.0
2ae6878b1000-2ae687ab0000 ---p 00033000 08:01 403422290 /usr/local/openmpi/lib/libopen-pal.so.0.0.0
2ae687ab0000-2ae687ab1000 r--p 00032000 08:01 403422290 /usr/local/openmpi/lib/libopen-pal.so.0.0.0
2ae687ab1000-2ae687ab3000 rw-p 00033000 08:01 403422290 /usr/local/openmpi/lib/libopen-pal.so.0.0.0
2ae687ab3000-2ae687ad5000 rw-p 2ae687ab3000 00:00 0
2ae687af3000-2ae687af5000 r-xp 00000000 08:01 100681700 /lib64/libdl-2.6.1.so
2ae687af5000-2ae687cf5000 ---p 00002000 08:01 100681700 /lib64/libdl-2.6.1.so
2ae687cf5000-2ae687cf7000 rw-p 00002000 08:01 100681700 /lib64/libdl-2.6.1.so
2ae687cf7000-2ae687cf8000 rw-p 2ae687cf7000 00:00 0
2ae687cf8000-2ae687d0c000 r-xp 00000000 08:01 100681705 /lib64/libnsl-2.6.1.so
2ae687d0c000-2ae687f0b000 ---p 00014000 08:01 100681705 /lib64/libnsl-2.6.1.so
2ae687f0b000-2ae687f0d000 rw-p 00013000 08:01 100681705 /lib64/libnsl-2.6.1.so
2ae687f0d000-2ae687f0f000 rw-p 2ae687f0d000 00:00 0
2ae687f0f000-2ae687f11000 r-xp 00000000 08:01 100681728 /lib64/libutil-2.6.1.so
2ae687f11000-2ae688110000 ---p 00002000 08:01 100681728 /lib64/libutil-2.6.1.so
2ae688110000-2ae688112000 rw-p 00001000 08:01 100681728 /lib64/libutil-2.6.1.so
2ae688112000-2ae6881fe000 r-xp 00000000 08:01 67350662 /usr/lib64/libstdc++.so.6.0.9
2ae6881fe000-2ae6883fe000 ---p 000ec000 08:01 67350662 /usr/lib64/libstdc++.so.6.0.9
2ae6883fe000-2ae688404000 r--p 000ec000 08:01 67350662 /usr/lib64/libstdc++.so.6.0.9
2ae688404000-2ae688407000 rw-p 000f2000 08:01 67350662 /usr/lib64/libstdc++.so.6.0.9
2ae688407000-2ae68841b000 rw-p 2ae688407000 00:00 0
2ae68841b000-2ae68846d000 r-xp 00000000 08:01 100681702 /lib64/libm-2.6.1.so
2ae68846d000-2ae68866c000 ---p 00052000 08:01 100681702 /lib64/libm-2.6.1.so
2ae68866c000-2ae68866e000 rw-p 00051000 08:01 100681702 /lib64/libm-2.6.1.so
2ae68866e000-2ae68867b000 r-xp 00000000 08:01 100845329 /lib64/libgcc_s.so.1
2ae68867b000-2ae68887a000 ---p 0000d000 08:01 100845329 /lib64/libgcc_s.so.1
2ae68887a000-2ae68887c000 rw-p 0000c000 08:01 100845329 /lib64/libgcc_s.so.1
2ae68887c000-2ae688891000 r-xp 00000000 08:01 100681720 /lib64/libpthread-2.6.1.so
2ae688891000-2ae688a91000 ---p 00015000 08:01 100681720 /lib64/libpthread-2.6.1.so
2ae688a91000-2ae688a93000 rw-p 00015000 08:01 100681720 /lib64/libpthread-2.6.1.so
2ae688a93000-2ae688a98000 rw-p 2ae688a93000 00:00 0
2ae688a98000-2ae688bd4000 r-xp 00000000 08:01 100681566 /lib64/libc-2.6.1.so
2ae688bd4000-2ae688dd4000 ---p 0013c000 08:01 100681566 /lib64/libc-2.6.1.so
2ae688dd4000-2ae688dd7000 r--p 0013c000 08:01 100681566 /lib64/libc-2.6.1.so
2ae688dd7000-2ae688dd9000 rw-p 0013f000 08:01 100681566 /lib64/libc-2.6.1.so
2ae688dd9000-2ae688de0000 rw-p 2ae688dd9000 00:00 0
2ae68c000000-2ae68c021000 rw-p 2ae68c000000 00:00 0
2ae68c021000-2ae690000000 ---p 2ae68c021000 00:00 0
7fff23921000-7fff23936000 rw-p 7fff23921000 00:00 0 [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vdso]
[localhost] *** Process received signal ***
[localhost] Signal: Aborted (6)
[localhost] Signal code: (-6)
[localhost] [ 0] /lib64/libpthread.so.0 [0x2ae688889fb0]
[localhost] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x2ae688ac8b45]
[localhost] [ 2] /lib64/libc.so.6(abort+0x110) [0x2ae688aca0e0]
[localhost] [ 3] /lib64/libc.so.6 [0x2ae688b00fbb]
[localhost] [ 4] /lib64/libc.so.6 [0x2ae688b0621d]
[localhost] [ 5] /lib64/libc.so.6(cfree+0x76) [0x2ae688b07f76]
[localhost] [ 6] /usr/lib64/libstdc++.so.6(_ZNSs6assignERKSs+0x9c) [0x2ae6881b44bc]
[localhost] [ 7] ompi_info(_ZN9ompi_info15open_componentsEv+0x100) [0x405670]
[localhost] [ 8] ompi_info(main+0x11e7) [0x40b837]
[localhost] [ 9] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2ae688ab5b54]
[localhost] [10] ompi_info(__gxx_personality_v0+0x121) [0x405249]
[localhost] *** End of error message ***
Aborted

localhost:~/archives/openmpi-1.2.8> g++ -v
Using built-in specs.
Target: x86_64-suse-linux
Configured with: ../configure --enable-threads=posix --prefix=/usr --with-local-prefix=/usr/local --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.2.1 --enable-ssp --disable-libssp --disable-libgcj --with-slibdir=/lib64 --with-system-zlib --enable-shared --enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch --program-suffix=-4.2 --enable-version-specific-runtime-libs --without-system-libunwind --with-cpu=generic --host=x86_64-suse-linux
Thread model: posix
gcc version 4.2.1 (SUSE Linux)

On Nov 10, 2008, at 1:44 PM, Jeff Squyres wrote:

> If you're not using OpenFabrics-based networks, try configuring
> Open MPI --without-memory-manager and see if that fixes your problems.
>
>
> On Nov 8, 2008, at 5:31 PM, Robert Kubrick wrote:
>
>> George, I get a warning when running under the debugger: 'Lowest
>> section in system-supplied DSO at 0xffffe000 is .hash at ffffe0b4'.
>> The program hangs in _int_malloc():
>>
>> (gdb) run
>> Starting program: /opt/openmpi-1.2.7/bin/ompi_info
>> warning: Lowest section in system-supplied DSO at 0xffffe000 is .hash at ffffe0b4
>> [Thread debugging using libthread_db enabled]
>> [New Thread 0xf7b7d6d0 (LWP 16621)]
>> 1.2.7
>>
>> Program received signal SIGINT, Interrupt.
>> [Switching to Thread 0xf7b7d6d0 (LWP 16621)]
>> 0xf7e5267e in _int_malloc () from /opt/openmpi/lib/libopen-pal.so.0
>> (gdb) where
>> #0 0xf7e5267e in _int_malloc () from /opt/openmpi/lib/libopen-pal.so.0
>> #1 0xf7e544e1 in malloc () from /opt/openmpi/lib/libopen-pal.so.0
>> #2 0xf7db46c7 in operator new () from /usr/lib/libstdc++.so.6
>> #3 0xf7d8e121 in std::string::_Rep::_S_create () from /usr/lib/libstdc++.so.6
>> #4 0xf7d8ee18 in std::string::_Rep::_M_clone () from /usr/lib/libstdc++.so.6
>> #5 0xf7d8fac8 in std::string::reserve () from /usr/lib/libstdc++.so.6
>> #6 0xf7d8ff6a in std::string::append () from /usr/lib/libstdc++.so.6
>> #7 0x08054f30 in ompi_info::out ()
>> #8 0x08062a33 in ompi_info::show_ompi_version ()
>> #9 0x080533a0 in main ()
>>
>> On Nov 8, 2008, at 12:33 PM, George Bosilca wrote:
>>
>>> I think we had a similar problem in the past. It has something to
>>> do with the atomics on this architecture.
>>>
>>> I don't have access to such an architecture. Can you provide us with a
>>> stack trace when this happens?
>>>
>>> Thanks,
>>> george.
>>>
>>> On Nov 8, 2008, at 12:14 PM, Robert Kubrick wrote:
>>>
>>>> I am having problems building OMPI 1.2.7 on an Intel Xeon quad-core
>>>> 64-bit server. The compilation completes, but ompi_info
>>>> hangs after printing the OMPI version:
>>>>
>>>> # ompi_info
>>>> 1.2.7
>>>>
>>>> I tried to run a few MPI applications on this same install and
>>>> they work fine. What can cause ompi_info to hang?
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> Cisco Systems