Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: [OMPI devel] Intermittent mpirun crash?
From: Rolf vandeVaart (rvandevaart_at_[hidden])
Date: 2014-01-30 11:26:44

I am seeing this happen very intermittently: it looks like mpirun itself is getting a SEGV after the test has already reported that it passed. Is anyone else seeing this?
This is 1.7.4, built yesterday. (Note that I added some stuff to what is being printed out, so the message is slightly different from the stock 1.7.4 output.)

mpirun --np 6 -host drossetti-ivy0,drossetti-ivy1,drossetti-ivy2,drossetti-ivy3 --mca btl_openib_warn_default_gid_prefix 0 -- `pwd`/src/MPI_Waitsome_p_c
MPITEST info (0): Starting: MPI_Waitsome_p: Persistent Waitsome using two nodes
MPITEST_results: MPI_Waitsome_p: Persistent Waitsome using two nodes all tests PASSED (742)
[drossetti-ivy0:10353] *** Process (mpirun)received signal ***
[drossetti-ivy0:10353] Signal: Segmentation fault (11)
[drossetti-ivy0:10353] Signal code: Address not mapped (1)
[drossetti-ivy0:10353] Failing at address: 0x7fd31e5f208d
[drossetti-ivy0:10353] End of signal information - not sleeping
gmake[1]: *** [MPI_Waitsome_p_c] Segmentation fault (core dumped)
gmake[1]: Leaving directory `/geppetto/home/rvandevaart/public/ompi-tests/trunk/intel_tests'

(gdb) where
#0 0x00007fd31f620807 in ?? () from /lib64/
#1 0x00007fd31f6210b9 in _Unwind_Backtrace () from /lib64/
#2 0x00007fd31fb2893e in backtrace () from /lib64/
#3 0x00007fd320b0d622 in opal_backtrace_buffer (message_out=0x7fd31e5e33a0, len_out=0x7fd31e5e33ac)
    at ../../../../../opal/mca/backtrace/execinfo/backtrace_execinfo.c:57
#4 0x00007fd320b0a794 in show_stackframe (signo=11, info=0x7fd31e5e3930, p=0x7fd31e5e3800) at ../../../opal/util/stacktrace.c:354
#5 <signal handler called>
#6 0x00007fd31e5f208d in ?? ()
#7 0x00007fd31e5e46d8 in ?? ()
#8 0x000000000000c2a8 in ?? ()
#9 0x0000000000000000 in ?? ()
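
For readers unfamiliar with frames #0 through #5 above: they show a SIGSEGV handler that calls backtrace() to print the interrupted stack. The following is only a minimal sketch of that general pattern, assuming glibc's <execinfo.h>; it is not Open MPI's show_stackframe() or opal_backtrace_buffer(), and the names segv_handler and install_segv_handler are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 32

/* Hypothetical handler for illustration only; not the Open MPI code. */
static void segv_handler(int signo, siginfo_t *info, void *ctx)
{
    static const char msg[] = "*** Process received signal ***\n";
    void *frames[MAX_FRAMES];
    int n;

    (void)info;
    (void)ctx;

    if (write(STDERR_FILENO, msg, sizeof(msg) - 1) < 0) {
        /* Nothing sensible to do if stderr is unusable. */
    }

    /* backtrace() unwinds the stack that was interrupted by the signal.
     * It is not guaranteed to be async-signal-safe, and if the interrupted
     * stack is corrupted the unwinder itself may fault inside the handler. */
    n = backtrace(frames, MAX_FRAMES);

    /* backtrace_symbols_fd() writes straight to the descriptor and does
     * not call malloc(), unlike backtrace_symbols(). */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);

    /* SA_RESETHAND already restored the default action, so re-raising
     * terminates the process (and dumps core if that is enabled). */
    raise(signo);
}

/* Installs the handler for SIGSEGV. Returns 0 on success, -1 on error. */
int install_segv_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO | SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

A program would call install_segv_handler() once at startup. Frames #6 through #9 above resolve to nothing, so in a handler like this one the unwinder would be walking a stack that may not be valid.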
