Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] crash when using coll_tuned_use_dynamic_rules option with 1.4
From: Holger Berger (hberger_at_[hidden])
Date: 2010-01-22 13:25:30


Hi,

I tracked this down a bit, and my impression is that this piece of code in
coll_tuned_component.c

    if (ompi_coll_tuned_use_dynamic_rules) {
        mca_base_param_reg_string(&mca_coll_tuned_component.super.collm_version,
                                  "dynamic_rules_filename",
                                  "Filename of configuration file that contains the dynamic (@runtime) decision function rules",
                                  false, false, ompi_coll_tuned_dynamic_rules_filename,
                                  &ompi_coll_tuned_dynamic_rules_filename);
        if( ompi_coll_tuned_dynamic_rules_filename ) {
            OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:component_open Reading collective rules file [%s]",
                         ompi_coll_tuned_dynamic_rules_filename));
            rc = ompi_coll_tuned_read_rules_config_file( ompi_coll_tuned_dynamic_rules_filename,
                                                         &(mca_coll_tuned_component.all_base_rules), COLLCOUNT);
            if( rc >= 0 ) {
                OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:module_open Read %d valid rules\n", rc));
            } else {
                OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:module_open Reading collective rules file failed\n"));
                mca_coll_tuned_component.all_base_rules = NULL;
            }
        }
        ....
        }

does not initialize the msg_rules (ompi_coll_tuned_read_rules_config_file
normally does that by calling ompi_coll_tuned_mk_msg_rules) in the case that

ompi_coll_tuned_use_dynamic_rules is TRUE
and
ompi_coll_tuned_dynamic_rules_filename is not set (NULL)

which leads to a crash at the line
  if( (NULL == base_com_rule) || (0 == base_com_rule->n_msg_sizes))
in coll_tuned_dynamic_rules.c:361,
as base_com_rule seems to be uninitialized, but NOT zero, and points somewhere...

That is probably not intended, as it prevents the selection of an algorithm
by a switch like -mca coll_tuned_alltoall_algorithm 2.
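
To illustrate the idea, here is a minimal sketch of the pattern I would
expect, not the actual Open MPI code. All names in it (com_rule_t,
tuned_component_t, component_open, select_algorithm, default_alg) are
hypothetical stand-ins. The assumption is that the rules pointer is set to
NULL before anything else happens, so the NULL check quoted above can catch
the "no rules file" case and fall back to the algorithm forced on the
command line:

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-communicator rule structure. */
typedef struct {
    int n_msg_sizes;  /* number of message-size rules that were read */
    int default_alg;  /* hypothetical: algorithm the rules would pick */
} com_rule_t;

/* Hypothetical stand-in for the component state. */
typedef struct {
    com_rule_t *all_base_rules;
} tuned_component_t;

/*
 * Sketch of the open step: the pointer is put into a known NULL state
 * first, and only a rules file that was actually given may replace it.
 * rules_from_file stands in for the result of reading that file.
 */
void component_open(tuned_component_t *c, int use_dynamic_rules,
                    const char *rules_filename, com_rule_t *rules_from_file)
{
    c->all_base_rules = NULL;
    if (use_dynamic_rules && rules_filename != NULL) {
        c->all_base_rules = rules_from_file;
    }
}

/*
 * Same guard as the line quoted from coll_tuned_dynamic_rules.c: with no
 * usable rules, return the algorithm forced by the user.
 */
int select_algorithm(const com_rule_t *base_com_rule, int forced_alg)
{
    if ((NULL == base_com_rule) || (0 == base_com_rule->n_msg_sizes)) {
        return forced_alg;
    }
    return base_com_rule->default_alg;
}
```

With dynamic rules enabled and no filename, component_open leaves
all_base_rules at NULL, so select_algorithm(c.all_base_rules, 2) returns the
forced value 2 instead of dereferencing a stray pointer.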

Hope that helps to fix it...

-- 
Holger Berger
System Integration and Support
HPCE Division NEC Deutschland GmbH
Tel: +49-711-6877035 hberger_at_[hidden]
Fax: +49-711-6877145 http://www.nec.com/de
NEC Deutschland GmbH, Hansaallee 101, 40549 Düsseldorf
Geschäftsführer Yuya Momose
Handelsregister Düsseldorf HRB 57941; VAT ID DE129424743