
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] segv in coll tuned
From: Lenny Verkhovsky (lenny.verkhovsky_at_[hidden])
Date: 2009-10-12 09:48:10


Well, I see that it returns 0 at this line,
since base_com_rule->n_msg_sizes == 0:
coll_tuned_dynamic_rules.c +359
if( (NULL == base_com_rule) || (0 == base_com_rule->n_msg_sizes)) {
  return (0);
}
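
To illustrate what the guard above implies for a caller, here is a minimal sketch in C. It is not the Open MPI code: the types msg_rule_t and com_rule_t and the functions lookup_algorithm and choose_algorithm are hypothetical, and only the field name n_msg_sizes and the shape of the guard come from the line quoted above. It assumes that a return value of 0 means "no dynamic rule applies", so the caller substitutes a default instead of using 0 as an algorithm id.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the rule structures; only n_msg_sizes
 * is a field name taken from the quoted source line. */
typedef struct {
    size_t msg_size;   /* rule applies from this message size upward */
    int    algorithm;  /* nonzero algorithm id chosen by the rule */
} msg_rule_t;

typedef struct {
    int         n_msg_sizes;  /* number of entries in msg_rules */
    msg_rule_t *msg_rules;    /* sorted by ascending msg_size */
} com_rule_t;

/* Returns the algorithm id of the last rule whose msg_size <= size,
 * or 0 when no rule is usable (same guard as in the quoted line). */
static int lookup_algorithm(const com_rule_t *rule, size_t size)
{
    if ((NULL == rule) || (0 == rule->n_msg_sizes)) {
        return 0;
    }
    assert(NULL != rule->msg_rules);
    int alg = 0;
    for (int i = 0; i < rule->n_msg_sizes; i++) {
        if (rule->msg_rules[i].msg_size <= size) {
            alg = rule->msg_rules[i].algorithm;
        }
    }
    return alg;
}

/* Assumption: the caller treats 0 as "no dynamic rule" and falls back
 * to a default rather than passing 0 on as a valid algorithm id. */
static int choose_algorithm(const com_rule_t *rule, size_t size, int fallback)
{
    int alg = lookup_algorithm(rule, size);
    return (0 == alg) ? fallback : alg;
}
```

With an empty rule set (n_msg_sizes == 0), choose_algorithm(&rule, size, fallback) returns the fallback value; whether the real caller handles the 0 return this way is exactly what would need checking.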

Sometimes it passes if I run IMB with -npmin 4.
On Mon, Oct 12, 2009 at 3:37 PM, Lenny Verkhovsky <
lenny.verkhovsky_at_[hidden]> wrote:

> not since I started testing it :)
> it fails somewhere in the ompi_coll_tuned_get_target_method_params function; I
> am taking a look right now.
>
> On Mon, Oct 12, 2009 at 3:33 PM, Terry Dontje <Terry.Dontje_at_[hidden]> wrote:
>
>> Does that test also pass sometimes? I am seeing some random set of tests
>> segv'ing in the SM btl, using a v1.3 derivative.
>>
>> --td
>> Lenny Verkhovsky wrote:
>>
>>> Hi,
>>> I experience the following error with current trunk r22090. It also
>>> occurs in the 1.3 branch.
>>> #~/work/svn/ompi/branches/1.3//build_x86-64/install/bin/mpirun -H witch21 -np 4 -mca coll_tuned_use_dynamic_rules 1 ./IMB-MPI1
>>> Sometimes it's an error, and sometimes it's a segv. It reproduces with np>4.
>>> [witch21:26540] *** An error occurred in MPI_Barrier
>>> [witch21:26540] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
>>> [witch21:26540] *** MPI_ERR_ARG: invalid argument of some other kind
>>> [witch21:26540] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>>>
>>> --------------------------------------------------------------------------
>>> mpirun has exited due to process rank 0 with PID 26540 on
>>> node witch21 exiting without calling "finalize". This may
>>> have caused other processes in the application to be
>>> terminated by signals sent by mpirun (as reported here).
>>>
>>> --------------------------------------------------------------------------
>>> 3 total processes killed (some possibly by mpirun during cleanup)
>>>
>>> thanks
>>> Lenny.
>>> ------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>>