Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Openmpi 1.6.5 is freezing under GNU/Linux ia64
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-09-25 13:08:38


Wow - that is hard to understand as that code path hasn't changed in quite some time. Could you please do two things for us?

1. tell us how you are configuring OMPI

2. try the 1.7 branch using that same configuration

The 1.6 series is reaching its planned end-of-life, so we are trying to decide how important it is to chase this down - i.e., if you see the same problem on Debian with 1.7, then this becomes far more important.

Thanks
Ralph

On Sep 25, 2013, at 8:30 AM, Sylvestre Ledru <sylvestre_at_[hidden]> wrote:

> With --enable-debug, I am getting:
> openmpi-1.6.5/debian/tmp/usr/bin/mpirun.openmpi -mca plm_base_verbose 5 -mca ras_base_verbose 5 -mca rmaps_base_verbose 5 -mca ess_base_verbose 5 -c 4 foo
> mpirun.openmpi: orterun.c:636: orterun: Assertion `((0xdeafbeedULL << 32) + 0xdeafbeedULL) == ((opal_object_t *) (&cmd_line))->obj_magic_id' failed.
> [merulo:16918] *** Process received signal ***
> [merulo:16918] Signal: Aborted (6)
> [merulo:16918] Signal code: (-6)
> [merulo:16918] [ 0] linux-gate.so.1(__kernel_sigtramp+0x7fffffffff88f740) [0xa000000000040800]
> [merulo:16918] [ 1] linux-gate.so.1(__kernel_syscall_via_break+0x7fffffffff88f651) [0xa000000000040721]
> [merulo:16918] [ 2] /lib/ia64-linux-gnu/libc.so.6.1(gsignal-0x3112f0) [0x200000000049fdf0]
> [merulo:16918] [ 3] /lib/ia64-linux-gnu/libc.so.6.1(abort-0x309710) [0x20000000004a79e0]
> [merulo:16918] [ 4] /lib/ia64-linux-gnu/libc.so.6.1(+0x464b0) [0x200000000048e4b0]
> [merulo:16918] [ 5] /lib/ia64-linux-gnu/libc.so.6.1(__assert_fail-0x322a50) [0x200000000048e6b0]
> [merulo:16918] [ 6] openmpi-1.6.5/debian/tmp/usr/bin/mpirun.openmpi() [0x40000000000063d0]
> [merulo:16918] [ 7] openmpi-1.6.5/debian/tmp/usr/bin/mpirun.openmpi() [0x4000000000004120]
> [merulo:16918] [ 8] /lib/ia64-linux-gnu/libc.so.6.1(__libc_start_main-0x33fe70) [0x20000000004712a0]
> [merulo:16918] [ 9] openmpi-1.6.5/debian/tmp/usr/bin/mpirun.openmpi() [0x4000000000003f00]
> [merulo:16918] *** End of error message ***
> Aborted
>
>
> On 20/09/2013 23:58, Ralph Castain wrote:
>> Occurs to me - I bet you didn't configure this with --enable-debug, did you? If not, please reconfigure it and rerun so we can see the debug output
>>
>> On Sep 20, 2013, at 2:54 PM, Sylvestre Ledru <sylvestre_at_[hidden]> wrote:
>>
>>> On 20/09/2013 23:46, Ralph Castain wrote:
>>>> That's it?? Wow, that was useless.
>>> Isn't it? ;) That is why I asked for your help...
>>>> Can you attach to mpirun with gdb and tell me where it is sitting?
>>>>
>>> It is about as useful as the previous command:
>>> http://paste.debian.net/43882/
>>>
>>> Built with:
>>> $ mpicc foo.c -g -o foo -O0
>>>
>>> Sylvestre
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
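
For readers unfamiliar with the check that fails in the quoted trace: the assertion message compares a field named obj_magic_id against the constant ((0xdeafbeedULL << 32) + 0xdeafbeedULL). The following is a minimal C sketch of that general technique (a magic-number sanity check on an object header), not Open MPI's actual code. The type demo_object_t and all demo_* functions are hypothetical names invented for illustration; only the constant and the field name are taken from the assertion message. The sketch also assumes that the field is set when the object is constructed and cleared when it is destructed, which the thread itself does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Same constant that appears in the failed assertion text. */
#define DEMO_MAGIC_ID ((0xdeafbeedULL << 32) + 0xdeafbeedULL)

/* Hypothetical stand-in for an object header; NOT the real opal_object_t. */
typedef struct {
    uint64_t obj_magic_id;  /* holds DEMO_MAGIC_ID while the object is live */
    int      refcount;
} demo_object_t;

/* Assumed behavior: construction stamps the magic value into the header. */
static inline void demo_construct(demo_object_t *obj)
{
    obj->obj_magic_id = DEMO_MAGIC_ID;
    obj->refcount = 1;
}

/* Assumed behavior: destruction clears the magic value again. */
static inline void demo_destruct(demo_object_t *obj)
{
    obj->obj_magic_id = 0;
    obj->refcount = 0;
}

/* Returns 1 if the header carries the magic value, 0 otherwise. */
static inline int demo_is_valid(const demo_object_t *obj)
{
    return obj->obj_magic_id == DEMO_MAGIC_ID;
}

/* Debug-style check: aborts via assert() when the magic value is missing,
 * which is the kind of failure shown in the trace. */
static inline void demo_check(const demo_object_t *obj)
{
    assert(DEMO_MAGIC_ID == obj->obj_magic_id);
}
```

Under these assumptions, calling demo_check() on an object that was never passed to demo_construct(), that was already passed to demo_destruct(), or whose header was overwritten would abort in the same way as the trace. Which of these (if any) applies to cmd_line on ia64 is not established in this thread.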