Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] MPIR attach from padb broken (1.5.5rc1)
From: Nathan Hjelm (hjelmn_at_[hidden])
Date: 2011-12-15 17:03:47


That appears to be a similar problem to the MPIR_Breakpoint bug. Let me play around and see if I can find a fix.

-Nathan Hjelm
HPC-3, LANL

On Thu, 15 Dec 2011, Ashley Pittman wrote:

>
> There is a problem with 1.5.5rc1 that prevents padb from loading the process table starting from the orterun process. What appears to be happening is that MPIR_proctable and MPIR_proctable_size are present both in orterun itself and in libopen-rte.so. The code correctly sets them in libopen-rte.so, but gdb picks the variables from orterun in preference, and hence padb reads NULL values.
>
> Attached is a log showing the problem. The only change I made to the source was to add a call to orte_debugger_base_dump() before the return from orte_debugger_base_init_after_spawn(). It looks like this could also have been achieved via a debug setting, but I couldn't see how.
>
> Ashley.
>
> ubuntu_at_c0:/cloud/ubuntu/imb_3.2.3/src$ orterun -H c0,c2 -n 2 sleep 100000&
> [1] 6572
> ubuntu_at_c0:/cloud/ubuntu/imb_3.2.3/src$ MPIR_being_debugged = 0
> MPIR_debug_state = 1
> MPIR_partial_attach_ok = 1
> MPIR_i_am_starter = 0
> MPIR_forward_output = 0
> MPIR_proctable_size = 2
> MPIR_proctable:
> (i, host, exe, pid) = (0, c0, /cloud/ubuntu/imb_3.2.3/src/sleep, 6574)
> (i, host, exe, pid) = (1, c2, /cloud/ubuntu/imb_3.2.3/src/sleep, 17557)
> MPIR_executable_path: NULL
> MPIR_server_arguments: NULL
>
> ubuntu_at_c0:/cloud/ubuntu/imb_3.2.3/src$ ps auwx | grep orterun
> ubuntu 6572 0.1 0.2 60384 2616 pts/0 S 21:50 0:00 orterun -H c0,c2 -n 2 sleep 100000
> ubuntu 6576 0.0 0.0 7964 880 pts/0 S+ 21:50 0:00 grep --color=auto orterun
> ubuntu_at_c0:/cloud/ubuntu/imb_3.2.3/src$ gdb -p 6572
> GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08
> Copyright (C) 2011 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-linux-gnu".
> For bug reporting instructions, please see:
> <http://bugs.launchpad.net/gdb-linaro/>.
> Attaching to process 6572
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/bin/orterun...done.
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/libopen-rte.so.3...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/libopen-rte.so.3
> Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libpthread-2.13.so...done.
> [Thread debugging using libthread_db enabled]
> done.
> Loaded symbols for /lib/x86_64-linux-gnu/libpthread.so.0
> Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.13.so...done.
> done.
> Loaded symbols for /lib/x86_64-linux-gnu/libc.so.6
> Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libdl-2.13.so...done.
> done.
> Loaded symbols for /lib/x86_64-linux-gnu/libdl.so.2
> Reading symbols from /lib/x86_64-linux-gnu/libutil.so.1...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libutil-2.13.so...done.
> done.
> Loaded symbols for /lib/x86_64-linux-gnu/libutil.so.1
> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_paffinity_hwloc.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_paffinity_hwloc.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_carto_auto_detect.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_carto_auto_detect.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_shmem_mmap.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_shmem_mmap.so
> Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/librt-2.13.so...done.
> done.
> Loaded symbols for /lib/x86_64-linux-gnu/librt.so.1
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_ess_hnp.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_ess_hnp.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_pstat_linux.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_pstat_linux.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_sysinfo_linux.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_sysinfo_linux.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_plm_rsh.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_plm_rsh.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_rml_oob.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_rml_oob.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_oob_tcp.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_oob_tcp.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_routed_binomial.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_routed_binomial.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_grpcomm_bad.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_grpcomm_bad.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_rmaps_round_robin.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_rmaps_round_robin.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_errmgr_default.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_errmgr_default.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_odls_default.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_odls_default.so
> Reading symbols from /lib/x86_64-linux-gnu/libnss_compat.so.2...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libnss_compat-2.13.so...done.
> done.
> Loaded symbols for /lib/x86_64-linux-gnu/libnss_compat.so.2
> Reading symbols from /lib/x86_64-linux-gnu/libnsl.so.1...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libnsl-2.13.so...done.
> done.
> Loaded symbols for /lib/x86_64-linux-gnu/libnsl.so.1
> Reading symbols from /lib/x86_64-linux-gnu/libnss_nis.so.2...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libnss_nis-2.13.so...done.
> done.
> Loaded symbols for /lib/x86_64-linux-gnu/libnss_nis.so.2
> Reading symbols from /lib/x86_64-linux-gnu/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libnss_files-2.13.so...done.
> done.
> Loaded symbols for /lib/x86_64-linux-gnu/libnss_files.so.2
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_iof_hnp.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_iof_hnp.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_filem_rsh.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_filem_rsh.so
> Reading symbols from /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_debugger_mpir.so...done.
> Loaded symbols for /cloud/ubuntu/openmpi-1.5.5rc1/lib/openmpi/mca_debugger_mpir.so
> 0x00007f09d37b4738 in __GI___poll (fds=0xb89010, nfds=10, timeout=1000) at ../sysdeps/unix/sysv/linux/poll.c:83
> 83 ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
> in ../sysdeps/unix/sysv/linux/poll.c
> (gdb) p MPIR_proctable_size
> $1 = 0
> (gdb) p MPIR_proctable
> $1 = (struct MPIR_PROCDESC *) 0x0
> (gdb) quit
> A debugging session is active.
>
> Inferior 1 [process 6572] will be detached.
>
> Quit anyway? (y or n) y
> Detaching from program: /cloud/ubuntu/openmpi-1.5.5rc1/bin/orterun, process 6572
> ubuntu_at_c0:/cloud/ubuntu/imb_3.2.3/src$ nm /cloud/ubuntu/openmpi-1.5.5rc1/bin/orterun | grep MPIR
> U MPIR_Breakpoint
> 000000000060dc60 B MPIR_attach_fifo
> 000000000060e994 B MPIR_being_debugged
> 000000000060e238 B MPIR_debug_state
> 000000000060de60 B MPIR_executable_path
> 000000000060e990 B MPIR_force_to_main
> 000000000060e0c4 B MPIR_forward_comm
> 000000000060e980 B MPIR_forward_output
> 000000000060e8c8 B MPIR_i_am_starter
> 000000000060e0c0 B MPIR_partial_attach_ok
> 000000000060e988 B MPIR_proctable
> 000000000060e7c0 B MPIR_proctable_size
> 000000000060e360 B MPIR_server_arguments
> ubuntu_at_c0:/cloud/ubuntu/imb_3.2.3/src$ ldd /cloud/ubuntu/openmpi-1.5.5rc1/bin/orterun
> linux-vdso.so.1 => (0x00007fff8eec2000)
> libopen-rte.so.3 => /cloud/ubuntu/openmpi-1.5.5rc1/lib/libopen-rte.so.3 (0x00007f875b209000)
> libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f875afe5000)
> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f875ac45000)
> libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f875aa41000)
> libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f875a83e000)
> /lib64/ld-linux-x86-64.so.2 (0x00007f875b504000)
> ubuntu_at_c0:/cloud/ubuntu/imb_3.2.3/src$ nm /cloud/ubuntu/openmpi-1.5.5rc1/lib/libopen-rte.so.3 | grep MPIR
> 0000000000048da0 T MPIR_Breakpoint
> 00000000002f5ac0 B MPIR_attach_fifo
> 00000000002f1dcc B MPIR_being_debugged
> 00000000002f1dc8 B MPIR_debug_state
> 00000000002f5cc0 B MPIR_executable_path
> 00000000002f1db8 B MPIR_force_to_main
> 00000000002f1dbc B MPIR_forward_comm
> 00000000002f1dc0 B MPIR_forward_output
> 00000000002f1dc4 B MPIR_i_am_starter
> 00000000002ef4e0 D MPIR_partial_attach_ok
> 00000000002f1dd8 B MPIR_proctable
> 00000000002f1dd0 B MPIR_proctable_size
> 00000000002f5ee0 B MPIR_server_arguments
> ubuntu_at_c0:/cloud/ubuntu/imb_3.2.3/src$
>
>
>
> Ashley.
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
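
The duplicate definitions visible in the two nm listings above can also be found mechanically, which may help when checking another build for the same symptom. What follows is only a minimal sketch in Python, not anything taken from Open MPI or padb: the helper names defined_symbols and duplicated are made up for illustration, and the sample input is a short excerpt of the nm output quoted in the log. It reports MPIR_ symbols that are defined (have an address) in both the executable and the library; undefined references such as "U MPIR_Breakpoint" are ignored.

```python
import re

# A defined symbol in `nm` output looks like: <hex address> <type letter> <name>.
# Undefined symbols ("U name") carry no address and do not match this pattern.
NM_DEFINED = re.compile(r"^\s*([0-9a-fA-F]+)\s+([A-Za-z])\s+(\S+)\s*$")


def defined_symbols(nm_output, prefix="MPIR_"):
    """Return {name: (type_letter, address)} for defined symbols with the prefix."""
    symbols = {}
    for line in nm_output.splitlines():
        match = NM_DEFINED.match(line)
        if match is None:
            continue  # blank line or undefined symbol
        address, kind, name = match.groups()
        if name.startswith(prefix):
            symbols[name] = (kind, int(address, 16))
    return symbols


def duplicated(exe_nm, lib_nm, prefix="MPIR_"):
    """Return the sorted names defined in both nm listings."""
    exe = defined_symbols(exe_nm, prefix)
    lib = defined_symbols(lib_nm, prefix)
    return sorted(set(exe) & set(lib))


# Excerpt of the nm output for orterun, copied from the log.
EXE_NM = """
                 U MPIR_Breakpoint
000000000060e988 B MPIR_proctable
000000000060e7c0 B MPIR_proctable_size
"""

# Excerpt of the nm output for libopen-rte.so.3, copied from the log.
LIB_NM = """
0000000000048da0 T MPIR_Breakpoint
00000000002f1dd8 B MPIR_proctable
00000000002f1dd0 B MPIR_proctable_size
"""

if __name__ == "__main__":
    for name in duplicated(EXE_NM, LIB_NM):
        print("defined in both:", name)
```

In practice the two strings would be replaced by the full output of nm on each file. Any name the sketch reports is a candidate for the behaviour described in the report, where the debugger resolves the executable's copy while the runtime writes to the library's copy.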