Open MPI Development Mailing List Archives

From: Brian Barrett (brbarret_at_[hidden])
Date: 2006-08-30 17:32:37


Hi all-

A question about stack tracing. Currently, we have it set up so that,
say, a segfault results in:

[0] func:/u/jjhursey/local/odin/ompi/devel/lib/libopal.so.0(opal_backtrace_print+0x2b) [0x2a959166ab]
[1] func:/u/jjhursey/local/odin/ompi/devel/lib/libopal.so.0 [0x2a959150bb]
[2] func:/lib64/tls/libpthread.so.0 [0x345cc0c420]
[3] func:/san/homedirs/jjhursey/local/odin//ompi/devel/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x480) [0x2a95fd6354]
[4] func:/u/jjhursey/local/odin/ompi/devel/lib/liborte.so.0(mca_oob_recv_packed+0x46) [0x2a957a96a3]
[5] func:/u/jjhursey/local/odin/ompi/devel/lib/libmpi.so.0(ompi_comm_connect_accept+0x1d8) [0x2a955a29dc]
[6] func:/u/jjhursey/local/odin/ompi/devel/lib/libmpi.so.0(ompi_comm_dyn_init+0x110) [0x2a955a49e0]

This seems to confuse some users (not Josh, I was just reading his latest
bug when I thought of this) into thinking that the error must be in OMPI
because that's where it segfaulted. It would be fairly trivial (at
least, on Linux and OS X) to not print the first 3 lines (the frames
added by the backtrace printer and the signal handling) such that the
error looked like:

[0] func:/san/homedirs/jjhursey/local/odin//ompi/devel/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x480) [0x2a95fd6354]
[1] func:/u/jjhursey/local/odin/ompi/devel/lib/liborte.so.0(mca_oob_recv_packed+0x46) [0x2a957a96a3]
[2] func:/u/jjhursey/local/odin/ompi/devel/lib/libmpi.so.0(ompi_comm_connect_accept+0x1d8) [0x2a955a29dc]
[3] func:/u/jjhursey/local/odin/ompi/devel/lib/libmpi.so.0(ompi_comm_dyn_init+0x110) [0x2a
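
For illustration, here is a minimal sketch of how the innermost frames could be dropped, assuming a platform that provides backtrace() and backtrace_symbols() in <execinfo.h> (glibc on Linux, and OS X). The names print_backtrace_skipping, skip and MAX_FRAMES are hypothetical and are not taken from the OPAL code; inside a real signal handler the skip count would be chosen to cover the printer, the handler and the libpthread frame.

```c
#include <execinfo.h>   /* backtrace(), backtrace_symbols() */
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 32   /* hypothetical cap on the captured stack depth */

/*
 * Hypothetical helper: print the current call stack to `out`, omitting
 * the `skip` innermost frames and renumbering the rest from 0.
 * Returns the number of frames actually printed.
 */
int print_backtrace_skipping(FILE *out, int skip)
{
    void *frames[MAX_FRAMES];
    char **names;
    int total, i, printed = 0;

    total = backtrace(frames, MAX_FRAMES);
    if (skip < 0) {
        skip = 0;
    }
    if (total <= skip) {
        return 0;               /* nothing left to show */
    }

    /* One malloc'ed array of strings; the caller frees only the array. */
    names = backtrace_symbols(frames, total);
    if (names == NULL) {
        return 0;
    }

    for (i = skip; i < total; i++) {
        fprintf(out, "[%d] func:%s\n", i - skip, names[i]);
        printed++;
    }

    free(names);
    return printed;
}
```

Calling print_backtrace_skipping(stderr, 3) from the handler would then start the listing at the frame that actually faulted, provided the three innermost frames are the ones described above; that count is an assumption and may differ between platforms.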

Would anyone object to such a change?

Brian