Open MPI Development Mailing List Archives

This web mail archive is frozen.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: [OMPI devel] New crash on trunk (r32246)
From: Rolf vandeVaart (rvandevaart_at_[hidden])
Date: 2014-07-15 09:54:42


With the latest trunk (r32246) I am getting crashes while the program is shutting down. I assume this is related to some of the changes George just made. George, can you take a look when you get a chance?
It looks like every process (mpirun, orted, and the application) gets the segv during shutdown. The stack trace of the application shows this:

Program terminated with signal 11, Segmentation fault.
#0 0x00007fc48c6a3145 in opal_class_finalize () at ../../opal/class/opal_object.c:175
175 free(cls->cls_construct_array);
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6_4.5.x86_64 libgcc-4.4.7-3.el6.x86_64
(gdb) where
#0 0x00007fc48c6a3145 in opal_class_finalize () at ../../opal/class/opal_object.c:175
#1 0x00007fc48c6a8253 in opal_finalize_util () at ../../opal/runtime/opal_finalize.c:110
#2 0x00007fc48d2697e9 in ompi_mpi_finalize () at ../../ompi/runtime/ompi_mpi_finalize.c:454
#3 0x00007fc48d2925a9 in PMPI_Finalize () at pfinalize.c:46
#4 0x0000000000401687 in main (argc=1, argv=0x7fff0e936fb8) at isend.c:109
(gdb) quit

mpirun -host drossetti-ivy0,drossetti-ivy1 -np 2 --mca pml ob1 --mca btl sm,tcp,self --mca coll_ml_disable_allgather 1 --mca btl_openib_warn_default_gid_prefix 0 isend
[drossetti-ivy0:13073] *** Process received signal ***
[drossetti-ivy0:13073] Signal: Segmentation fault (11)
[drossetti-ivy0:13073] Signal code: Address not mapped (1)
[drossetti-ivy0:13073] Failing at address: 0x7fc48abb2d68
[drossetti-ivy0:13073] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7fc48d005500]
[drossetti-ivy0:13073] [ 1] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_class_finalize+0x4a)[0x7fc48c6a3145]
[drossetti-ivy0:13073] [ 2] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_finalize_util+0xc3)[0x7fc48c6a8253]
[drossetti-ivy0:13073] [ 3] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libmpi.so.0(ompi_mpi_finalize+0xc4c)[0x7fc48d2697e9]
[drossetti-ivy0:13073] [ 4] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libmpi.so.0(PMPI_Finalize+0x59)[0x7fc48d2925a9]
[drossetti-ivy0:13073] [ 5] isend[0x401687]
[drossetti-ivy0:13073] [ 6] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fc48cc81cdd]
[drossetti-ivy0:13073] [ 7] isend[0x400f49]
[drossetti-ivy0:13073] *** End of error message ***
[drossetti-ivy1:29629] *** Process received signal ***
[drossetti-ivy1:29629] Signal: Segmentation fault (11)
[drossetti-ivy1:29629] Signal code: Address not mapped (1)
[drossetti-ivy1:29629] Failing at address: 0x7f239ded6d68
[drossetti-ivy1:29629] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7f23a0329500]
[drossetti-ivy1:29629] [ 1] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_class_finalize+0x4a)[0x7f239f9c7145]
[drossetti-ivy1:29629] [ 2] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_finalize_util+0xc3)[0x7f239f9cc253]
[drossetti-ivy1:29629] [ 3] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libmpi.so.0(ompi_mpi_finalize+0xc4c)[0x7f23a058d7e9]
[drossetti-ivy1:29629] [ 4] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libmpi.so.0(PMPI_Finalize+0x59)[0x7f23a05b65a9]
[drossetti-ivy1:29629] [ 5] isend[0x401687]
[drossetti-ivy1:29629] [ 6] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f239ffa5cdd]
[drossetti-ivy1:29629] [ 7] isend[0x400f49]
[drossetti-ivy1:29629] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node drossetti-ivy0 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
[drossetti-ivy0:13070] *** Process received signal ***
[drossetti-ivy0:13070] Signal: Segmentation fault (11)
[drossetti-ivy0:13070] Signal code: Address not mapped (1)
[drossetti-ivy0:13070] Failing at address: 0x7eff348fbd68
[drossetti-ivy0:13070] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7eff362d1500]
[drossetti-ivy0:13070] [ 1] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_class_finalize+0x4a)[0x7eff36fb4145]
[drossetti-ivy0:13070] [ 2] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_finalize_util+0xc3)[0x7eff36fb9253]
[drossetti-ivy0:13070] [ 3] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_finalize+0x105)[0x7eff36fb935f]
[drossetti-ivy0:13070] [ 4] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-rte.so.0(orte_finalize+0xd3)[0x7eff372b9f9f]
[drossetti-ivy0:13070] [ 5] mpirun(orterun+0x15b5)[0x40573e]
[drossetti-ivy0:13070] [ 6] mpirun(main+0x20)[0x403a14]
[drossetti-ivy0:13070] [ 7] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7eff35f4dcdd]
[drossetti-ivy0:13070] [ 8] mpirun[0x403939]
[drossetti-ivy0:13070] *** End of error message ***
[drossetti-ivy1:29628] *** Process received signal ***
[drossetti-ivy1:29628] Signal: Segmentation fault (11)
[drossetti-ivy1:29628] Signal code: Address not mapped (1)
[drossetti-ivy1:29628] Failing at address: 0x7fc78217ed68
[drossetti-ivy1:29628] [ 0] /lib64/libpthread.so.0(+0xf500)[0x7fc783b4f500]
[drossetti-ivy1:29628] [ 1] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_class_finalize+0x4a)[0x7fc784832145]
[drossetti-ivy1:29628] [ 2] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_finalize_util+0xc3)[0x7fc784837253]
[drossetti-ivy1:29628] [ 3] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-pal.so.0(opal_finalize+0x105)[0x7fc78483735f]
[drossetti-ivy1:29628] [ 4] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-rte.so.0(orte_finalize+0xd3)[0x7fc784b37f9f]
[drossetti-ivy1:29628] [ 5] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/lib/libopen-rte.so.0(orte_daemon+0x23b9)[0x7fc784b6a47d]
[drossetti-ivy1:29628] [ 6] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/bin/orted(main+0x86)[0x40094a]
[drossetti-ivy1:29628] [ 7] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7fc7837cbcdd]
[drossetti-ivy1:29628] [ 8] /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/bin/orted[0x400809]
[drossetti-ivy1:29628] *** End of error message ***
bash: line 1: 29628 Segmentation fault /ivylogin/home/rvandevaart/ompi-repos/ompi-trunk-original/64-dbg-nocuda/bin/orted -mca ess env -mca orte_ess_jobid 3963420672 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 -mca orte_hnp_uri "3963420672.0;tcp://10.31.124.51:45481" --tree-spawn --mca pml ob1 --mca btl sm,tcp,self --mca coll_ml_disable_allgather 1 --mca btl_openib_warn_default_gid_prefix 0 -mca plm rsh -mca dstore ^pmi
Segmentation fault (core dumped)
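
The faulting line in frame #0 reads cls->cls_construct_array, and the signal code is "Address not mapped". One possible explanation, which is only a guess and not confirmed by anything above, is that the finalize loop holds pointers to class descriptors, and some of those descriptors sit in memory that is no longer mapped by the time the loop runs (for example, in a dynamically loaded component that was already closed). A minimal sketch of a finalize pattern that avoids touching descriptors at shutdown is below. It is not the Open MPI code: demo_class_t, demo_class_initialize, and demo_class_finalize are hypothetical names used only for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a class descriptor; NOT the real opal_class_t. */
typedef struct demo_class {
    void **construct_array;   /* heap block allocated at class init */
    int    initialized;
} demo_class_t;

/* Registry of heap blocks to release at shutdown. It stores the blocks
 * themselves, so finalize never has to dereference a descriptor. */
static void **demo_blocks     = NULL;
static int    demo_num_blocks = 0;
static int    demo_max_blocks = 0;

/* Allocate the per-class array and record the block in the registry.
 * Returns 0 on success (or if already initialized), -1 on allocation failure. */
int demo_class_initialize(demo_class_t *cls, size_t n_slots)
{
    if (cls->initialized) {
        return 0;
    }
    if (demo_num_blocks == demo_max_blocks) {
        int new_max = demo_max_blocks ? 2 * demo_max_blocks : 8;
        void **tmp = realloc(demo_blocks, (size_t)new_max * sizeof(*tmp));
        if (NULL == tmp) {
            return -1;
        }
        demo_blocks = tmp;
        demo_max_blocks = new_max;
    }
    cls->construct_array = calloc(n_slots ? n_slots : 1, sizeof(void *));
    if (NULL == cls->construct_array) {
        return -1;
    }
    demo_blocks[demo_num_blocks++] = cls->construct_array;
    cls->initialized = 1;
    return 0;
}

/* Free every recorded block and reset the registry.
 * Returns the number of blocks freed. No descriptor is read here, so it
 * does not matter whether the descriptors' memory is still mapped. */
int demo_class_finalize(void)
{
    int freed = 0;
    for (int i = 0; i < demo_num_blocks; ++i) {
        free(demo_blocks[i]);
        ++freed;
    }
    free(demo_blocks);
    demo_blocks = NULL;
    demo_num_blocks = 0;
    demo_max_blocks = 0;
    return freed;
}
```

The trade-off in this sketch is that the descriptors are left with stale construct_array pointers after finalize, so a re-initialization scheme would need some other way to detect that (for example, an epoch counter). That part is deliberately left out.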
