Open MPI Development Mailing List Archives

Subject: [OMPI devel] New crash on trunk (r32246)
From: Rolf vandeVaart (rvandevaart_at_[hidden])
Date: 2014-07-15 09:54:42


With the latest trunk (r32246), I am getting crashes while the program is shutting down. I assume this is related to some of the changes George just made. George, can you take a look when you get a chance?
It looks like every process segfaults during shutdown (mpirun, orted, and the application). The stack trace from the application shows this:

Program terminated with signal 11, Segmentation fault.
#0 0x00007fc48c6a3145 in opal_class_finalize () at ../../opal/class/opal_object.c:175
175 free(cls->cls_construct_array);
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6_4.5.x86_64 libgcc-4.4.7-3.el6.x86_64
(gdb) where
#0 0x00007fc48c6a3145 in opal_class_finalize () at ../../opal/class/opal_object.c:175
#1 0x00007fc48c6a8253 in opal_finalize_util () at ../../opal/runtime/opal_finalize.c:110
#2 0x00007fc48d2697e9 in ompi_mpi_finalize () at ../../ompi/runtime/ompi_mpi_finalize.c:454
#3 0x00007fc48d2925a9 in PMPI_Finalize () at pfinalize.c:46
#4 0x0000000000401687 in main (argc=1, argv=0x7fff0e936fb8) at isend.c:109
(gdb) quit

mpirun -host drossetti-ivy0,drossetti-ivy1 -np 2 --mca pml ob1 --mca btl sm,tcp,self --mca coll_ml_disable_allgather 1 --mca btl_openib_warn_default_gid_prefix 0 isend
[drossetti-ivy0:13073] *** Process received signal ***
[drossetti-ivy0:13073] Signal: Segmentation fault (11)
[drossetti-ivy0:13073] Signal code: Address not mapped (1)
[drossetti-ivy0:13073] Failing at address: 0x7fc48abb2d68
[drossetti-ivy0:13073] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7fc48d005500]
[drossetti-ivy0:13073] [ 1] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_class_finalize+0x4a)[0x7fc48c6a3145]
[drossetti-ivy0:13073] [ 2] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_finalize_util+0xc3)[0x7fc48c6a8253]
[drossetti-ivy0:13073] [ 3] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libmpi.so.0(ompi_mpi_finalize+0xc4c)[0x7fc48d2697e9]
[drossetti-ivy0:13073] [ 4] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libmpi.so.0(PMPI_Finalize+0x59)[0x7fc48d2925a9]
[drossetti-ivy0:13073] [ 5] isend[0x401687]
[drossetti-ivy0:13073] [ 6] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fc48cc81cdd]
[drossetti-ivy0:13073] [ 7] isend[0x400f49]
[drossetti-ivy0:13073] *** End of error message ***
[drossetti-ivy1:29629] *** Process received signal ***
[drossetti-ivy1:29629] Signal: Segmentation fault (11)
[drossetti-ivy1:29629] Signal code: Address not mapped (1)
[drossetti-ivy1:29629] Failing at address: 0x7f239ded6d68
[drossetti-ivy1:29629] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7f23a0329500]
[drossetti-ivy1:29629] [ 1] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_class_finalize+0x4a)[0x7f239f9c7145]
[drossetti-ivy1:29629] [ 2] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_finalize_util+0xc3)[0x7f239f9cc253]
[drossetti-ivy1:29629] [ 3] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libmpi.so.0(ompi_mpi_finalize+0xc4c)[0x7f23a058d7e9]
[drossetti-ivy1:29629] [ 4] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libmpi.so.0(PMPI_Finalize+0x59)[0x7f23a05b65a9]
[drossetti-ivy1:29629] [ 5] isend[0x401687]
[drossetti-ivy1:29629] [ 6] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f239ffa5cdd]
[drossetti-ivy1:29629] [ 7] isend[0x400f49]
[drossetti-ivy1:29629] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node drossetti-ivy0 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
[drossetti-ivy0:13070] *** Process received signal ***
[drossetti-ivy0:13070] Signal: Segmentation fault (11)
[drossetti-ivy0:13070] Signal code: Address not mapped (1)
[drossetti-ivy0:13070] Failing at address: 0x7eff348fbd68
[drossetti-ivy0:13070] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7eff362d1500]
[drossetti-ivy0:13070] [ 1] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_class_finalize+0x4a)[0x7eff36fb4145]
[drossetti-ivy0:13070] [ 2] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_finalize_util+0xc3)[0x7eff36fb9253]
[drossetti-ivy0:13070] [ 3] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_finalize+0x105)[0x7eff36fb935f]
[drossetti-ivy0:13070] [ 4] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-rte.so.0(orte_finalize+0xd3)[0x7eff372b9f9f]
[drossetti-ivy0:13070] [ 5] mpirun(orterun+0x15b5)[0x40573e]
[drossetti-ivy0:13070] [ 6] mpirun(main+0x20)[0x403a14]
[drossetti-ivy0:13070] [ 7] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7eff35f4dcdd]
[drossetti-ivy0:13070] [ 8] mpirun[0x403939]
[drossetti-ivy0:13070] *** End of error message ***
[drossetti-ivy1:29628] *** Process received signal ***
[drossetti-ivy1:29628] Signal: Segmentation fault (11)
[drossetti-ivy1:29628] Signal code: Address not mapped (1)
[drossetti-ivy1:29628] Failing at address: 0x7fc78217ed68
[drossetti-ivy1:29628] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7fc783b4f500]
[drossetti-ivy1:29628] [ 1] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_class_finalize+0x4a)[0x7fc784832145]
[drossetti-ivy1:29628] [ 2] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_finalize_util+0xc3)[0x7fc784837253]
[drossetti-ivy1:29628] [ 3] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_finalize+0x105)[0x7fc78483735f]
[drossetti-ivy1:29628] [ 4] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-rte.so.0(orte_finalize+0xd3)[0x7fc784b37f9f]
[drossetti-ivy1:29628] [ 5] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-rte.so.0(orte_daemon+0x23b9)[0x7fc784b6a47d]
[drossetti-ivy1:29628] [ 6] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/bin/orted(main+0x86)[0x40094a]
[drossetti-ivy1:29628] [ 7] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fc7837cbcdd]
[drossetti-ivy1:29628] [ 8] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/bin/orted[0x400809]
[drossetti-ivy1:29628] *** End of error message ***
bash: line 1: 29628 Segmentation fault /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/bin/orted -mca ess env -mca orte_ess_jobid 3963420672 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 -mca orte_hnp_uri "3963420672.0;tcp://10.31.124.51:45481" --tree-spawn --mca pml ob1 --mca btl sm,tcp,self --mca coll_ml_disable_allgather 1 --mca btl_openib_warn_default_gid_prefix 0 -mca plm rsh -mca dstore ^pmi
Segmentation fault (core dumped)
