
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] usage of mca variables in orte-restart
From: Adrian Reber (adrian_at_[hidden])
Date: 2014-03-18 04:11:10


Thanks for your fix.

You say that the environment is only taken into account when a
variable is registered. opal-restart.c sets another variable in the
environment. Does the following still work?

opal-restart.c:

    (void) mca_base_var_env_name("crs", &tmp_env_var);
    opal_setenv(tmp_env_var,
                expected_crs_comp,
                true, &environ);
    free(tmp_env_var);
    tmp_env_var = NULL;

This is how the preferred checkpointer is selected. Then, in
opal_crs_base_select(), the following happens:

    if( OPAL_SUCCESS != mca_base_select("crs", opal_crs_base_framework.framework_output,
                                        &opal_crs_base_framework.framework_components,
                                        (mca_base_module_t **) &best_module,
                                        (mca_base_component_t **) &best_component) ) {
        /* This will only happen if no component was selected */
        exit_status = OPAL_ERROR;
        goto cleanup;
    }

Does the environment variable named by mca_base_var_env_name() still
influence which crs module mca_base_select() picks? Or do I also have
to switch this code to mca_base_var_set_value() to select the preferred
crs module?
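
To make the concern concrete, here is a toy model of "the environment is only read at registration". It is a sketch only and not Open MPI code: toy_var_t, toy_var_register(), toy_var_set_value(), toy_var_get_value(), toy_demo() and the TOY_MCA_crs environment name are all hypothetical stand-ins. It does not say how mca_base_select() actually behaves; that is the open question above.

```c
#define _POSIX_C_SOURCE 200809L  /* for setenv()/unsetenv() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a registered variable; NOT the real MCA variable type. */
typedef struct {
    char env_name[64];
    char value[64];
} toy_var_t;

/* Registration: the only place where this model consults the environment. */
static void toy_var_register(toy_var_t *var, const char *env_name,
                             const char *default_value)
{
    const char *from_env = getenv(env_name);

    snprintf(var->env_name, sizeof(var->env_name), "%s", env_name);
    snprintf(var->value, sizeof(var->value), "%s",
             from_env ? from_env : default_value);
}

/* Explicit setter: changes the stored value without touching the environment. */
static void toy_var_set_value(toy_var_t *var, const char *value)
{
    snprintf(var->value, sizeof(var->value), "%s", value);
}

static const char *toy_var_get_value(const toy_var_t *var)
{
    return var->value;
}

/* Walks through the three cases; returns 0 if the model behaves as described. */
static int toy_demo(void)
{
    toy_var_t early, late;

    unsetenv("TOY_MCA_crs");
    toy_var_register(&early, "TOY_MCA_crs", "none");

    /* Setting the environment after registration is not seen by 'early'. */
    setenv("TOY_MCA_crs", "self", 1);
    assert(0 == strcmp(toy_var_get_value(&early), "none"));

    /* An explicit set does take effect. */
    toy_var_set_value(&early, "self");
    assert(0 == strcmp(toy_var_get_value(&early), "self"));

    /* A variable registered after the setenv() does pick it up. */
    toy_var_register(&late, "TOY_MCA_crs", "none");
    assert(0 == strcmp(toy_var_get_value(&late), "self"));

    return 0;
}
```

In this model the opal_setenv() approach only works if the variable is registered after the environment is changed; otherwise an explicit set is needed.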

                Adrian

On Mon, Mar 17, 2014 at 08:47:16AM -0600, Nathan Hjelm wrote:
> Good catch. Fixing now.
>
> -Nathan
>
> On Mon, Mar 17, 2014 at 02:50:02PM +0100, Adrian Reber wrote:
> > On Fri, Mar 14, 2014 at 10:18:06PM +0000, Hjelm, Nathan T wrote:
> > > The preferred way is to use mca_base_var_find and then call mca_base_var_[set|get]_value. For performance's sake we only look at the environment when the variable is registered.
> >
> > I believe I found a bug in mca_base_var_set_value using bool variables:
> >
> > #0 0x00007f6e0d8fb800 in mca_base_var_enum_bool_sfv (self=0x7f6e0dbabc20 <mca_base_var_enum_bool>, value=0,
> > string_value=0x0) at ../../../../opal/mca/base/mca_base_var_enum.c:82
> > #1 0x00007f6e0d8f45d6 in mca_base_var_set_value (vari=120, value=0x4031e6, size=0, source=MCA_BASE_VAR_SOURCE_DEFAULT,
> > source_file=0x0) at ../../../../opal/mca/base/mca_base_var.c:636
> > #2 0x0000000000401e44 in main (argc=7, argv=0x7fffa72a0a78) at ../../../../opal/tools/opal-restart/opal-restart.c:223
> >
> > I am using set_value like this:
> >
> > bool test=false;
> > mca_base_var_set_value(idx, &test, 0, MCA_BASE_VAR_SOURCE_DEFAULT, NULL);
> >
> > As the size is ignored I am just setting it to '0'.
> >
> > mca_base_var_set_value() does
> >
> > ret = var->mbv_enumerator->string_from_value(var->mbv_enumerator,((int *) value)[0], NULL);
> >
> > which calls mca_base_var_enum_bool_sfv() with the last parameter set to NULL:
> >
> > static int mca_base_var_enum_bool_sfv (mca_base_var_enum_t *self, const int value,
> >                                        const char **string_value)
> > {
> >     *string_value = value ? "true" : "false";
> >
> >     return OPAL_SUCCESS;
> > }
> >
> > and here it tries to access the last parameter (string_value) which has
> > been set to NULL. As I cannot find any usage of mca_base_var_set_value()
> > with bool variables this code path has probably not been used until now.
> >
> > Adrian
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post: http://www.open-mpi.org/community/lists/devel/2014/03/14354.php
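
The crash reported above comes from writing through a NULL out-pointer. The actual fix is not shown in this thread; the following is only a sketch of one possible guard, using hypothetical names (toy_enum_bool_sfv, TOY_SUCCESS) instead of the real Open MPI symbols. It treats a NULL string_value as "validate only, do not return a string".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SUCCESS 0  /* hypothetical stand-in for OPAL_SUCCESS */

/*
 * Guarded variant of a bool "string from value" helper: the string is
 * only written when the caller supplied somewhere to put it.
 */
static int toy_enum_bool_sfv(int value, const char **string_value)
{
    if (NULL != string_value) {
        *string_value = value ? "true" : "false";
    }

    return TOY_SUCCESS;
}
```

With this guard, a caller that only wants to check the value can pass NULL as the last argument, as mca_base_var_set_value() does in the backtrace, without dereferencing it.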

> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/03/14355.php

                Adrian

-- 
Adrian Reber <adrian_at_[hidden]>            http://lisas.de/~adrian/
printk(KERN_ERR "msp3400: chip reset failed, penguin on i2c bus?\n");
	2.2.16 /usr/src/linux/drivers/char/msp3400.c

