Open MPI Development Mailing List Archives

Subject: [OMPI devel] MPIR attach from padb broken (1.5.5rc1)
From: Ashley Pittman (ashley_at_[hidden])
Date: 2011-12-15 16:55:20


There is a problem with 1.5.5rc1 that prevents padb from loading the process table from the orterun process. What appears to be happening is that MPIR_proctable and MPIR_proctable_size are present both in orterun itself and in libopen-rte.so. The code is correctly setting them in libopen-rte.so; however, gdb picks the variables from orterun in preference, and hence padb reads NULL values.
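
As a rough illustration of the duplication described above, the following Python sketch compares the `nm` listings of two objects and reports symbols that are defined in both. The helper names (`defined_symbols`, `duplicated`) are made up for this example, and the sample text is an abbreviated excerpt of the nm output in the attached log; it is only a diagnostic aid, not something padb or Open MPI provides.

```python
# Abbreviated excerpts of the nm output from the attached log.
ORTERUN_NM = """\
                 U MPIR_Breakpoint
000000000060e988 B MPIR_proctable
000000000060e7c0 B MPIR_proctable_size
"""

LIBRTE_NM = """\
0000000000048da0 T MPIR_Breakpoint
00000000002f1dd8 B MPIR_proctable
00000000002f1dd0 B MPIR_proctable_size
"""


def defined_symbols(nm_output, prefix="MPIR_"):
    """Return {name: type} for symbols that nm lists with an address."""
    symbols = {}
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols have three fields: address, type, name.
        # Undefined ones ("U name") have only two and are skipped.
        if len(parts) == 3 and parts[1] != "U" and parts[2].startswith(prefix):
            symbols[parts[2]] = parts[1]
    return symbols


def duplicated(nm_a, nm_b, prefix="MPIR_"):
    """Return the sorted names defined in both nm listings."""
    a = defined_symbols(nm_a, prefix)
    b = defined_symbols(nm_b, prefix)
    return sorted(set(a) & set(b))


if __name__ == "__main__":
    for name in duplicated(ORTERUN_NM, LIBRTE_NM):
        print(name)
```

On the excerpt above this lists MPIR_proctable and MPIR_proctable_size, while MPIR_Breakpoint is skipped because orterun only references it (type U) rather than defining it.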

Attached is a log showing the problem. The only change I made to the source was to add a call to orte_debugger_base_dump() before the return from orte_debugger_base_init_after_spawn(). It looks like this could also have been achieved via a debug setting, but I couldn't see how.

Ashley.

ubuntu@c0:/cloud/ubuntu/imb_3.2.3/src$ orterun -H c0,c2 -n 2 sleep 100000&
[1] 6572
ubuntu@c0:/cloud/ubuntu/imb_3.2.3/src$ MPIR_being_debugged = 0
  MPIR_debug_state = 1
  MPIR_partial_attach_ok = 1
  MPIR_i_am_starter = 0
  MPIR_forward_output = 0
  MPIR_proctable_size = 2
  MPIR_proctable:
    (i, host, exe, pid) = (0, c0, /cloud/ubuntu/imb_3.2.3/src/sleep, 6574)
    (i, host, exe, pid) = (1, c2, /cloud/ubuntu/imb_3.2.3/src/sleep, 17557)
MPIR_executable_path: NULL
MPIR_server_arguments: NULL

ubuntu@c0:/cloud/ubuntu/imb_3.2.3/src$ ps auwx | grep orterun
ubuntu 6572 0.1 0.2 60384 2616 pts/0 S 21:50 0:00 orterun -H c0,c2 -n 2 sleep 100000
ubuntu 6576 0.0 0.0 7964 880 pts/0 S+ 21:50 0:00 grep --color=auto orterun
ubuntu@c0:/cloud/ubuntu/imb_3.2.3/src$ gdb -p 6572
GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>.
Attaching to process 6572
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/bin/orterun...done.
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/libopen-rte.so.3...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/libopen-rte.so.3
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libpthread-2.13.so...done.
[Thread debugging using libthread_db enabled]
done.
Loaded symbols for /lib/x86_64-linux-gnu/libpthread.so.0
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.13.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libc.so.6
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libdl-2.13.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libdl.so.2
Reading symbols from /lib/x86_64-linux-gnu/libutil.so.1...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libutil-2.13.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libutil.so.1
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_paffinity_hwloc.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_paffinity_hwloc.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_carto_auto_detect.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_carto_auto_detect.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_shmem_mmap.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_shmem_mmap.so
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/librt-2.13.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/librt.so.1
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_ess_hnp.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_ess_hnp.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_pstat_linux.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_pstat_linux.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_sysinfo_linux.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_sysinfo_linux.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_plm_rsh.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_plm_rsh.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_rml_oob.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_rml_oob.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_oob_tcp.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_oob_tcp.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_routed_binomial.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_routed_binomial.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_grpcomm_bad.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_grpcomm_bad.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_rmaps_round_robin.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_rmaps_round_robin.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_errmgr_default.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_errmgr_default.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_odls_default.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_odls_default.so
Reading symbols from /lib/x86_64-linux-gnu/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libnss_compat-2.13.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libnss_compat.so.2
Reading symbols from /lib/x86_64-linux-gnu/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libnsl-2.13.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libnsl.so.1
Reading symbols from /lib/x86_64-linux-gnu/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libnss_nis-2.13.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libnss_nis.so.2
Reading symbols from /lib/x86_64-linux-gnu/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libnss_files-2.13.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libnss_files.so.2
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_iof_hnp.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_iof_hnp.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_filem_rsh.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_filem_rsh.so
Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_debugger_mpir.so...done.
Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_debugger_mpir.so
0x00007f09d37b4738 in __GI___poll (fds=0xb89010, nfds=10, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:83
83 ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
        in ../sysdeps/unix/sysv/linux/poll.c
(gdb) p MPIR_proctable_size
$1 = 0
(gdb) p MPIR_proctable
$1 = (struct MPIR_PROCDESC *) 0x0
(gdb) quit
A debugging session is active.

        Inferior 1 [process 6572] will be detached.

Quit anyway? (y or n) y
Detaching from program: /cloud/ubuntu/openmpi-1.5.5rc1/bin/orterun, process 6572
ubuntu@c0:/cloud/ubuntu/imb_3.2.3/src$ nm /cloud/ubuntu/openmpi-1.5.5rc1/bin/orterun | grep MPIR
                 U MPIR_Breakpoint
000000000060dc60 B MPIR_attach_fifo
000000000060e994 B MPIR_being_debugged
000000000060e238 B MPIR_debug_state
000000000060de60 B MPIR_executable_path
000000000060e990 B MPIR_force_to_main
000000000060e0c4 B MPIR_forward_comm
000000000060e980 B MPIR_forward_output
000000000060e8c8 B MPIR_i_am_starter
000000000060e0c0 B MPIR_partial_attach_ok
000000000060e988 B MPIR_proctable
000000000060e7c0 B MPIR_proctable_size
000000000060e360 B MPIR_server_arguments
ubuntu@c0:/cloud/ubuntu/imb_3.2.3/src$ ldd /cloud/ubuntu/openmpi-1.5.5rc1/bin/orterun
        linux-vdso.so.1 => (0x00007fff8eec2000)
        libopen-rte.so.3 => /cloud/ubuntu/openmpi-1.5.5rc1/lib/libopen-rte.so.3 (0x00007f875b209000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f875afe5000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f875ac45000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f875aa41000)
        libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f875a83e000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f875b504000)
ubuntu@c0:/cloud/ubuntu/imb_3.2.3/src$ nm /cloud/ubuntu/openmpi-1.5.5rc1/lib/libopen-rte.so.3 | grep MPIR
0000000000048da0 T MPIR_Breakpoint
00000000002f5ac0 B MPIR_attach_fifo
00000000002f1dcc B MPIR_being_debugged
00000000002f1dc8 B MPIR_debug_state
00000000002f5cc0 B MPIR_executable_path
00000000002f1db8 B MPIR_force_to_main
00000000002f1dbc B MPIR_forward_comm
00000000002f1dc0 B MPIR_forward_output
00000000002f1dc4 B MPIR_i_am_starter
00000000002ef4e0 D MPIR_partial_attach_ok
00000000002f1dd8 B MPIR_proctable
00000000002f1dd0 B MPIR_proctable_size
00000000002f5ee0 B MPIR_server_arguments
ubuntu@c0:/cloud/ubuntu/imb_3.2.3/src$
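
The nm values for libopen-rte.so.3 in the log above are offsets within the library, so one conceivable way for a tool to reach the library's copy of a variable directly is to add the library's load base in the running process. The sketch below shows only that arithmetic. It assumes the library's first loadable segment starts at virtual address 0 (typical for shared objects, but not checked here). The /proc/<pid>/maps lines are a made-up sample, not taken from the log, and the helper names are hypothetical.

```python
# Hypothetical /proc/<pid>/maps excerpt; the addresses are invented for illustration.
SAMPLE_MAPS = """\
7f09d3c00000-7f09d3c4a000 r-xp 00000000 08:01 123456 /cloud/ubuntu/openmpi-1.5.5rc1/lib/libopen-rte.so.3
7f09d3e49000-7f09d3e4d000 rw-p 00049000 08:01 123456 /cloud/ubuntu/openmpi-1.5.5rc1/lib/libopen-rte.so.3
7f09d37a0000-7f09d3935000 r-xp 00000000 08:01 654321 /lib/x86_64-linux-gnu/libc.so.6
"""

# Offset of MPIR_proctable_size in libopen-rte.so.3, from the nm output in the log.
PROCTABLE_SIZE_OFFSET = 0x2F1DD0


def load_base(maps_text, library_name):
    """Return the lowest start address at which library_name is mapped."""
    starts = []
    for line in maps_text.splitlines():
        fields = line.split()
        # Line format: address-range perms offset dev inode [pathname]
        if len(fields) >= 6 and fields[5].endswith(library_name):
            start = fields[0].split("-")[0]
            starts.append(int(start, 16))
    if not starts:
        raise ValueError("library not mapped: " + library_name)
    return min(starts)


def runtime_address(maps_text, library_name, symbol_offset):
    """Load base plus nm offset (assumes the first segment has vaddr 0)."""
    return load_base(maps_text, library_name) + symbol_offset


if __name__ == "__main__":
    addr = runtime_address(SAMPLE_MAPS, "libopen-rte.so.3", PROCTABLE_SIZE_OFFSET)
    print(hex(addr))
```

The load addresses printed by ldd earlier in the log belong to the ldd invocation itself and need not match the running orterun process, which is why the sketch reads the base from the target process's maps instead.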