On 07/23/2013 09:54 AM, Jeff Squyres (jsquyres) wrote:
> I don't know if Fedora RPMs include -g in their builds, or if Fedora
> includes a debuginfo RPM that you could install such that you can attach
> a debugger and be able to dig into OMPI's internals yourself.
There is a debuginfo package.
Since I had removed all of Fedora's openmpi packages and installed from
source into /opt/openmpi-1.6.5 and /opt/openmpi-1.6.5_hwloc-1.4.3 to
narrow down this problem, I had to reinstall the RPMs with yum:
sudo yum install openmpi openmpi-devel openmpi-debuginfo
These don't put anything into my PATH or LD_LIBRARY_PATH, so I have to
load the module:
module load mpi/openmpi-x86_64
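One way to confirm that the module load took effect is to check whether the expected directory now appears in LD_LIBRARY_PATH. This is only a sketch: the helper name path_contains is made up for illustration, and the directory is assumed to be /usr/lib64/openmpi/lib, which is where libmpi.so.1 resolves in the ldd output.

```shell
# Hypothetical helper: succeed if directory $2 is an entry of the
# colon-separated list $1 (e.g. "$PATH" or "$LD_LIBRARY_PATH").
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;   # exact entry found between colons
        *)        return 1 ;;   # not present
    esac
}

# Example use after "module load mpi/openmpi-x86_64"
# (the directory is an assumption taken from the ldd output):
#   path_contains "$LD_LIBRARY_PATH" /usr/lib64/openmpi/lib && echo "module loaded"
```

Wrapping the list in colons means that only whole entries match, so /usr/lib64/openmpi does not count as a hit for /usr/lib64/openmpi/lib.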
I compiled my simple program with :
mpicc -g -o mpi_simple mpi_simple.c
The program links against Fedora's copies of the libraries of interest:
mpirun -n 1 ldd mpi_simple | grep hwloc
libhwloc.so.5 => /lib64/libhwloc.so.5 (0x0000003c57600000)
mpirun -n 1 ldd mpi_simple | grep mpi
libmpi.so.1 => /usr/lib64/openmpi/lib/libmpi.so.1 (0x00007f7207e29000)
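The two grep calls above can be folded into a single filter. The following is a rough sketch, not something from the Open MPI tooling: filter_libs is a hypothetical name, and it assumes ldd prints lines of the form "soname => path (address)" as shown above.

```shell
# Hypothetical helper: read ldd output on stdin and print "soname path"
# for every library whose soname matches the extended regex in $1.
filter_libs() {
    awk -v pat="$1" '$2 == "=>" && $1 ~ pat { print $1, $3 }'
}

# Example use with the binary built above:
#   ldd mpi_simple | filter_libs 'hwloc|mpi'
```

Printing only the soname and the resolved path makes it easy to see at a glance whether the binary picks up the copies under /lib64 and /usr/lib64/openmpi/lib or the ones installed under /opt.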
I started the debugger with :
mpirun -n 1 gdb mpi_simple
When run in the debugger I got the error I described.
I reran and in gdb did :
set breakpoint pending on
That took me into 'opal_dss_unpack'. Then I did 'next' until I got past
'opal_dss_unpack_buffer', which returned the -1 we see outside.