Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] crash when using coll_tuned_use_dynamic_rules option with 1.4
From: George Bosilca (bosilca_at_[hidden])
Date: 2010-01-29 04:06:09


r22510 solves this problem.

  george.

On Jan 24, 2010, at 01:35 , Lenny Verkhovsky wrote:

> It's a known issue.
> Try providing a file with rules.
> https://svn.open-mpi.org/trac/ompi/ticket/2087
> Lenny.
> On Fri, Jan 22, 2010 at 8:25 PM, Holger Berger <hberger_at_[hidden]> wrote:
> Hi,
>
> I tracked this down a bit, and my impression is that this piece of code in
> coll_tuned_component.c
>
> if (ompi_coll_tuned_use_dynamic_rules) {
>     mca_base_param_reg_string(&mca_coll_tuned_component.super.collm_version,
>                               "dynamic_rules_filename",
>                               "Filename of configuration file that contains the dynamic (@runtime) decision function rules",
>                               false, false, ompi_coll_tuned_dynamic_rules_filename,
>                               &ompi_coll_tuned_dynamic_rules_filename);
>     if( ompi_coll_tuned_dynamic_rules_filename ) {
>         OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:component_open Reading collective rules file [%s]",
>                      ompi_coll_tuned_dynamic_rules_filename));
>         rc = ompi_coll_tuned_read_rules_config_file( ompi_coll_tuned_dynamic_rules_filename,
>                                                      &(mca_coll_tuned_component.all_base_rules), COLLCOUNT);
>         if( rc >= 0 ) {
>             OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:module_open Read %d valid rules\n", rc));
>         } else {
>             OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:module_open Reading collective rules file failed\n"));
>             mca_coll_tuned_component.all_base_rules = NULL;
>         }
>     }
>     ....
> }
>
> does not initialize the msg_rules, which ompi_coll_tuned_read_rules_config_file
> would otherwise do by calling ompi_coll_tuned_mk_msg_rules, in the case that
>
> ompi_coll_tuned_use_dynamic_rules is TRUE
> and
> ompi_coll_tuned_dynamic_rules_filename is FALSE
>
> which leads to a crash at the line
> if( (NULL == base_com_rule) || (0 == base_com_rule->n_msg_sizes))
> in coll_tuned_dynamic_rules.c:361,
> as base_com_rule seems to be uninitialized, but NOT zero, and points somewhere...
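
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the Open MPI code and not the change made in r22510: the types `com_rule_t` and `alg_rule_t` and the helpers `alloc_rules` and `has_msg_rules` are hypothetical stand-ins. The point it tries to show is that a check of the form `NULL == base_com_rule` only protects the dereference if every rule pointer was explicitly set to NULL (or to a valid rule) beforehand; an uninitialized pointer passes the NULL test and is then dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-communicator-size rule. */
typedef struct {
    int n_msg_sizes;            /* number of message-size rules attached */
} com_rule_t;

/* Hypothetical stand-in for one entry of the per-collective rule table. */
typedef struct {
    com_rule_t *com_rule;       /* NULL means "no dynamic rules loaded" */
} alg_rule_t;

/* Allocate the table and explicitly set every rule pointer to NULL,
 * so that later NULL checks are meaningful even if no rules file is read. */
static alg_rule_t *alloc_rules(size_t n_collectives)
{
    assert(n_collectives > 0);
    alg_rule_t *rules = malloc(n_collectives * sizeof(*rules));
    if (rules == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n_collectives; i++) {
        rules[i].com_rule = NULL;
    }
    return rules;
}

/* Same shape as the check quoted above: it is safe only when the pointer
 * is either NULL or points at a valid com_rule_t. */
static int has_msg_rules(const com_rule_t *base_com_rule)
{
    if ((NULL == base_com_rule) || (0 == base_com_rule->n_msg_sizes)) {
        return 0;               /* fall back to the non-dynamic decision */
    }
    return 1;
}
```

With the table initialized this way, `has_msg_rules(rules[i].com_rule)` returns 0 for every collective when no rules file was read, instead of reading through a stray pointer.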
>
>
> That is probably not intended, as it prevents selecting an algorithm
> with a switch like -mca coll_tuned_alltoall_algorithm 2.
>
> Hope that helps to fix it...
>
>
>
>
>
> --
> Holger Berger
> System Integration and Support
> HPCE Division NEC Deutschland GmbH
> Tel: +49-711-6877035 hberger_at_[hidden]
> Fax: +49-711-6877145 http://www.nec.com/de
> NEC Deutschland GmbH, Hansaallee 101, 40549 Düsseldorf
> Geschäftsführer Yuya Momose
> Handelsregister Düsseldorf HRB 57941; VAT ID DE129424743
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel