Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] segv in ompi_info
From: Gilles Gouaillardet (gilles.gouaillardet_at_[hidden])
Date: 2014-07-09 06:47:45


Mike,

How do you test?
I cannot reproduce a bug:

if I run ompi_info -a -l 9 | less

and press 'q' at an early stage (i.e. before all the output has been written
to the pipe),
then the less process exits, and ompi_info receives SIGPIPE and terminates
(which is normal Unix behaviour).

Now, if I press the spacebar until the end of the output (i.e. until I get
the (END) message from less)
and then press 'q', there is no problem.
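
The two cases above can be compared without a pager. This is only a minimal sketch, assuming a Linux system with `seq` from coreutils; `seq` is a stand-in for ompi_info (it writes far more than a pipe buffer holds), and `run_writer` is a hypothetical helper name, not anything from Open MPI.

```python
import subprocess


def run_writer(read_everything: bool) -> int:
    """Start a writer whose stdout is a pipe and return its exit status.

    A negative status -N means the writer was killed by signal N.
    """
    # `seq 1 200000` stands in for ompi_info: it produces well over 1 MB,
    # so it is still writing when the reader goes away early.
    writer = subprocess.Popen(["seq", "1", "200000"], stdout=subprocess.PIPE)
    if read_everything:
        writer.stdout.read()      # like paging down to (END) in less
    else:
        writer.stdout.readline()  # like pressing 'q' at an early stage
    writer.stdout.close()         # the reader goes away
    return writer.wait()


for read_everything in (True, False):
    print("read everything:", read_everything, "status:", run_writer(read_everything))
```

When everything is read, the writer should exit normally; when the read end is closed early, the writer's next write raises SIGPIPE and the status is the negated signal number.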

strace -e signal ompi_info -a -l 9 | true
causes ompi_info to receive SIGPIPE

strace -e signal dd if=/dev/zero bs=1M count=1 | true
causes dd to receive SIGPIPE
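
The dd case can also be checked without strace by looking at how the writer terminated. The following is a rough sketch under the same assumptions (Linux, GNU dd, `true` on the PATH); the function name is made up for illustration.

```python
import signal
import subprocess


def dd_status_when_piped_to_true() -> int:
    """Run the equivalent of `dd if=/dev/zero bs=1M count=1 | true`
    and return dd's exit status (negative means killed by a signal)."""
    dd = subprocess.Popen(
        ["dd", "if=/dev/zero", "bs=1M", "count=1"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # hide dd's transfer statistics
    )
    reader = subprocess.Popen(["true"], stdin=dd.stdout)
    dd.stdout.close()  # from now on only `true` holds the read end
    reader.wait()      # `true` exits without reading anything
    return dd.wait()


status = dd_status_when_piped_to_true()
if status < 0:
    # Translate the negated signal number back into a name.
    print("dd was killed by", signal.Signals(-status).name)
```

A shell would report the same termination as exit status 128 plus the signal number.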

Unless I am missing something, I would conclude there is no bug.

Cheers,

Gilles

On 2014/07/09 19:33, Mike Dubman wrote:
> mxm only intercepts signals and prints the stack trace.
> It happens on trunk as well,
> but only when "| less" is used.
>
> On Tue, Jul 8, 2014 at 4:50 PM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]>
> wrote:
>
>> I'm unable to replicate. Please provide more detail...? Is this a
>> problem in the MXM component?
>>
>> On Jul 8, 2014, at 9:20 AM, Mike Dubman <miked_at_[hidden]> wrote:
>>
>>>
>>> $/usr/mpi/gcc/openmpi-1.8.2a1/bin/ompi_info -a -l 9|less
>>> Caught signal 13 (Broken pipe)
>>> ==== backtrace ====
>>> 2 0x0000000000054cac mxm_handle_error() /var/tmp/OFED_topdir/BUILD/mxm-3.2.2883/src/mxm/util/debug/debug.c:653
>>> 3 0x0000000000054e74 mxm_error_signal_handler() /var/tmp/OFED_topdir/BUILD/mxm-3.2.2883/src/mxm/util/debug/debug.c:628
>>> 4 0x00000033fbe32920 killpg() ??:0
>>> 5 0x00000033fbedb650 __write_nocancel() interp.c:0
>>> 6 0x00000033fbe71d53 _IO_file_write@@GLIBC_2.2.5() ??:0
>>> 7 0x00000033fbe73305 _IO_do_write@@GLIBC_2.2.5() ??:0
>>> 8 0x00000033fbe719cd _IO_file_xsputn@@GLIBC_2.2.5() ??:0
>>> 9 0x00000033fbe48410 _IO_vfprintf() ??:0
>>> 10 0x00000033fbe4f40a printf() ??:0
>>> 11 0x000000000002bc84 opal_info_out() /var/tmp/OFED_topdir/BUILD/openmpi-1.8.2a1/opal/runtime/opal_info_support.c:853
>>> 12 0x000000000002c6bb opal_info_show_mca_group_params() /var/tmp/OFED_topdir/BUILD/openmpi-1.8.2a1/opal/runtime/opal_info_support.c:658
>>> 13 0x000000000002c882 opal_info_show_mca_group_params() /var/tmp/OFED_topdir/BUILD/openmpi-1.8.2a1/opal/runtime/opal_info_support.c:716
>>> 14 0x000000000002cc13 opal_info_show_mca_params() /var/tmp/OFED_topdir/BUILD/openmpi-1.8.2a1/opal/runtime/opal_info_support.c:742
>>> 15 0x000000000002d074 opal_info_do_params() /var/tmp/OFED_topdir/BUILD/openmpi-1.8.2a1/opal/runtime/opal_info_support.c:485
>>> 16 0x000000000040167b main() ??:0
>>> 17 0x00000033fbe1ecdd __libc_start_main() ??:0
>>> 18 0x0000000000401349 _start() ??:0
>>> ===================
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/07/15075.php
>>
>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>