Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] r31765 causes crash in mpirun
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-05-15 12:49:26


I fixed this by reverting r31765 in r31775, and I annotated the ticket with an explanation.

On May 15, 2014, at 1:20 AM, Gilles Gouaillardet <gilles.gouaillardet_at_[hidden]> wrote:

> Folks,
>
> Since r31765 (opal/event: release the opal event context when closing
> the event base), mpirun crashes at the end of the job.
>
> For example:
>
> $ mpirun --mca btl tcp,self -n 4 `pwd`/src/MPI_Allreduce_user_c
> MPITEST info (0): Starting MPI_Allreduce_user() test
> MPITEST_results: MPI_Allreduce_user() all tests PASSED (7076)
> [soleil:10959] *** Process received signal ***
> [soleil:10959] Signal: Segmentation fault (11)
> [soleil:10959] Signal code: Address not mapped (1)
> [soleil:10959] Failing at address: 0x7fd969e75a98
> [soleil:10959] [ 0] /lib64/libpthread.so.0[0x3c9da0f500]
> [soleil:10959] [ 1] /csc/home1/gouaillardet/local/ompi-trunk/lib/libopen-pal.so.0(+0x7bae5)[0x7fd96a55dae5]
> [soleil:10959] [ 2] /csc/home1/gouaillardet/local/ompi-trunk/lib/libopen-pal.so.0(+0x7ac97)[0x7fd96a55cc97]
> [soleil:10959] [ 3] /csc/home1/gouaillardet/local/ompi-trunk/lib/libopen-pal.so.0(opal_libevent2021_event_del+0x88)[0x7fd96a55ca15]
> [soleil:10959] [ 4] /csc/home1/gouaillardet/local/ompi-trunk/lib/libopen-pal.so.0(opal_libevent2021_event_base_free+0x132)[0x7fd96a558831]
> [soleil:10959] [ 5] /csc/home1/gouaillardet/local/ompi-trunk/lib/libopen-pal.so.0(+0x74126)[0x7fd96a556126]
> [soleil:10959] [ 6] /csc/home1/gouaillardet/local/ompi-trunk/lib/libopen-pal.so.0(mca_base_framework_close+0xdd)[0x7fd96a54026f]
> [soleil:10959] [ 7] /csc/home1/gouaillardet/local/ompi-trunk/lib/libopen-pal.so.0(opal_finalize+0x7e)[0x7fd96a50d36e]
> [soleil:10959] [ 8] /csc/home1/gouaillardet/local/ompi-trunk/lib/libopen-rte.so.0(orte_finalize+0xd3)[0x7fd96a7ead2f]
> [soleil:10959] [ 9] mpirun(orterun+0x1298)[0x404f0e]
> [soleil:10959] [10] mpirun(main+0x20)[0x4038a4]
> [soleil:10959] [11] /lib64/libc.so.6(__libc_start_main+0xfd)[0x3c9d21ecdd]
> [soleil:10959] [12] mpirun[0x4037c9]
> [soleil:10959] *** End of error message ***
> Segmentation fault (core dumped)
>
> Gilles
>
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/05/14806.php