Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] environment variables and MPI_Comm_spawn
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2013-12-13 19:42:34


Thanks for the first 2 patches, Tom -- I applied them to the SVN trunk and scheduled them to go into the v1.7 series. I don't know if they'll make 1.7.4 or be pushed to 1.7.5, but they'll get there.

I'll defer to Ralph for the rest of the discussion about info keys.

On Dec 13, 2013, at 9:16 AM, tom fogal <tfogal_at_[hidden]> wrote:

> Hi Ralph, thanks for your help!
>
> Ralph Castain writes:
>> It would have to be done via MPI_Info arguments, and we never had a
>> request to do so (and hence, don't define such an argument). It would
>> be easy enough to do so (look in the ompi/mca/dpm/orte/dpm_orte.c
>> code).
>
> Well, I wanted to just report success, but I've only got the easy
> side of it: saving the arguments from the MPI_Info arguments into
> the orte_job_t struct. See attached "0003" patch (against trunk).
> However, I couldn't figure out how to get the other side: reading out
> the environment variables and setting them at fork. Maybe you could
> help with (or do :-) that?
>
> Or just guide me as to where again: I threw abort()s in 'spawn'
> functions I found under plm/, but my programs didn't abort and so I'm
> not sure where they went.
>
>> MPI implementations generally don't forcibly propagate envars because
>> it is so hard to know which ones to handle - it is easy to propagate
>> a system envar that causes bad things to happen on the remote end.
>
> I understand. Though in this case, I'm /trying/ to make Bad Things
> (tm) happen ;-).
>
>> One thing you could do, of course, is add that envar to your default
>> shell setup (.bashrc or whatever). This would set the variable by
>> default on your remote locations (assuming you are using rsh/ssh
>> for your launcher), and then any process you start would get
>> it. However, that won't help if this is an envar intended only for
>> the comm_spawned process.
>
> Unfortunately what I want to play with at the moment are LD_*
> variables, and fiddling with these in my .bashrc will mess up a lot
> more than just the simulation I am presently hacking.
>
>> I can add this capability to the OMPI trunk, and port it to the 1.7
>> release - but we don't go all the way back to the 1.4 series any
>> more.
>
> Yes, having this in a 1.7 release would be great!
>
>
> BTW, I encountered a couple other small things while grepping through
> source/waiting for trunk to build, so there are two other small patches
> attached. One gets rid of warnings about unused functions in generated
> lexing code. I believe the second fixes resource leaks on error paths.
> However, it turned out none of my user-level code hit that function at
> all, so I haven't been able to test it. Take from it what you will...
>
> -tom
>
>> On Wed, Dec 11, 2013 at 2:10 PM, tom fogal <tfogal_at_[hidden]> wrote:
>>
>>> Hi all,
>>>
>>> I'm developing on Open MPI 1.4.5-ubuntu2 on Ubuntu 13.10 (so, Ubuntu's
>>> packaged Open MPI) at the moment.
>>>
>>> I'd like to pass environment variables to processes started via
>>> MPI_Comm_spawn. Unfortunately, the MPI 3.0 standard (at least) does
>>> not seem to specify a way to do this; thus I have been searching for
>>> implementation-specific ways to accomplish my task.
>>>
>>> I have tried setting the environment variable using the POSIX setenv(3)
>>> call, but it seems that Open MPI comm-spawn'd processes do not inherit
>>> environment variables. See the attached 2 C99 programs; one prints
>>> out the environment it receives, and one sets the MEANING_OF_LIFE
>>> environment variable, spawns the previous 'env printing' program, and
>>> exits. I run via:
>>>
>>> $ env -i HOME=/home/tfogal \
>>> PATH=/bin:/usr/bin:/usr/local/bin:/sbin:/usr/sbin \
>>> mpirun -x TJFVAR=testing -n 5 ./mpienv ./envpar
>>>
>>> and expect (well, hope) to find the MEANING_OF_LIFE in 'envpar's
>>> output. I do see TJFVAR, but the MEANING_OF_LIFE sadly does not
>>> propagate. Perhaps I am asking the wrong question...
>>>
>>> I found another MPI implementation which allowed passing such
>>> information via the MPI_Info argument; however, I could find no
>>> documentation of similar functionality in Open MPI.
>>>
>>> Is there a way to accomplish what I'm looking for? I could even be
>>> convinced to hack source, but a starting pointer would be appreciated.
>>>
>>> Thanks,
>>>
>>> -tom
>
> From 8285a7625e5ea014b9d4df5dd65a7642fd4bc322 Mon Sep 17 00:00:00 2001
> From: Tom Fogal <tfogal_at_[hidden]>
> Date: Fri, 13 Dec 2013 12:03:56 +0100
> Subject: [PATCH 1/3] btl: Remove warnings about unused lexing functions.
>
> ---
> ompi/mca/btl/openib/btl_openib_lex.l | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/ompi/mca/btl/openib/btl_openib_lex.l b/ompi/mca/btl/openib/btl_openib_lex.l
> index 2aa6059..7455b78 100644
> --- a/ompi/mca/btl/openib/btl_openib_lex.l
> +++ b/ompi/mca/btl/openib/btl_openib_lex.l
> @@ -1,3 +1,5 @@
> +%option nounput
> +%option noinput
> %{ /* -*- C -*- */
> /*
> * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
> --
> 1.8.3.2
>
> From dff9fd5ef69f09de6d0fee2236c39a79e8674f92 Mon Sep 17 00:00:00 2001
> From: Tom Fogal <tfogal_at_[hidden]>
> Date: Fri, 13 Dec 2013 13:06:41 +0100
> Subject: [PATCH 2/3] mca: cleanup buf, ps when errors occur.
>
> ---
> orte/mca/plm/base/plm_base_proxy.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/orte/mca/plm/base/plm_base_proxy.c b/orte/mca/plm/base/plm_base_proxy.c
> index 5d2b100..275cb3a 100644
> --- a/orte/mca/plm/base/plm_base_proxy.c
> +++ b/orte/mca/plm/base/plm_base_proxy.c
> @@ -128,14 +128,15 @@ int orte_plm_proxy_spawn(orte_job_t *jdata)
> command = ORTE_PLM_LAUNCH_JOB_CMD;
> if (ORTE_SUCCESS != (rc = opal_dss.pack(buf, &command, 1, ORTE_PLM_CMD))) {
> ORTE_ERROR_LOG(rc);
> + OBJ_RELEASE(buf);
> goto CLEANUP;
> }
>
> /* pack the jdata object */
> if (ORTE_SUCCESS != (rc = opal_dss.pack(buf, &jdata, 1, ORTE_JOB))) {
> ORTE_ERROR_LOG(rc);
> + OBJ_RELEASE(buf);
> goto CLEANUP;
> -
> }
>
> /* create the proxy spawn object */
> @@ -153,6 +154,7 @@ int orte_plm_proxy_spawn(orte_job_t *jdata)
> orte_rml_send_callback, NULL))) {
> ORTE_ERROR_LOG(rc);
> OBJ_RELEASE(buf);
> + OBJ_RELEASE(ps);
> goto CLEANUP;
> }
>
> --
> 1.8.3.2
>
> From a90f1fb49df1ff9442476b5e4294353ebb94498b Mon Sep 17 00:00:00 2001
> From: Tom Fogal <tfogal_at_[hidden]>
> Date: Fri, 13 Dec 2013 15:09:10 +0100
> Subject: [PATCH 3/3] info: accept env vars desired in child processes
>
> This looks for "env" keys in MPI_Info structures, which should be
> then used to forward environment variables from parent to child
> when spawning jobs. However, note this doesn't (yet) change the
> spawn machinery.
> ---
> ompi/mca/dpm/orte/dpm_orte.c | 12 ++++++++++++
> orte/runtime/orte_globals.c | 2 ++
> orte/runtime/orte_globals.h | 2 ++
> 3 files changed, 16 insertions(+)
>
> diff --git a/ompi/mca/dpm/orte/dpm_orte.c b/ompi/mca/dpm/orte/dpm_orte.c
> index 65099a5..b61d6f2 100644
> --- a/ompi/mca/dpm/orte/dpm_orte.c
> +++ b/ompi/mca/dpm/orte/dpm_orte.c
> @@ -680,6 +680,7 @@ static int spawn(int count, const char *array_of_commands[],
> char mapper[OPAL_PATH_MAX];
> int npernode;
> char slot_list[OPAL_PATH_MAX];
> + char envvar[1024]; /* better magic number? */
>
> orte_job_t *jdata;
> orte_app_context_t *app;
> @@ -705,6 +706,7 @@ static int spawn(int count, const char *array_of_commands[],
> - "path": list of directories where to look for the executable
> - "file": filename, where additional information is provided.
> - "soft": see page 92 of MPI-2.
> + - "env": environment variables desired in the children.
> */
>
> /* setup the job object */
> @@ -1358,6 +1360,16 @@ static int spawn(int count, const char *array_of_commands[],
> jdata->stdin_target = strtoul(stdin_target, NULL, 10);
> }
> }
> +
> + /* did the user want us to forward any environment variables? */
> + ompi_info_get (array_of_info[i], "env", sizeof(envvar)-1, envvar,
> + &flag);
> + if ( flag ) {
> + jdata->nenv_vars++;
> + jdata->env_vars = realloc(jdata->env_vars,
> + jdata->nenv_vars*sizeof(char*));
> + jdata->env_vars[jdata->nenv_vars-1] = strdup(envvar);
> + }
> }
>
> /* default value: If the user did not tell us where to look for the
> diff --git a/orte/runtime/orte_globals.c b/orte/runtime/orte_globals.c
> index f3e3029..e4ba975 100644
> --- a/orte/runtime/orte_globals.c
> +++ b/orte/runtime/orte_globals.c
> @@ -742,6 +742,8 @@ static void orte_job_construct(orte_job_t* job)
> job->ckpt_snapshot_ref = NULL;
> job->ckpt_snapshot_loc = NULL;
> #endif
> + job->env_vars = NULL;
> + job->nenv_vars = 0;
> }
>
> static void orte_job_destruct(orte_job_t* job)
> diff --git a/orte/runtime/orte_globals.h b/orte/runtime/orte_globals.h
> index f284045..d12296b 100644
> --- a/orte/runtime/orte_globals.h
> +++ b/orte/runtime/orte_globals.h
> @@ -463,6 +463,8 @@ typedef struct {
> /* snapshot location */
> char *ckpt_snapshot_loc;
> #endif
> + char** env_vars;
> + size_t nenv_vars;
> } orte_job_t;
> ORTE_DECLSPEC OBJ_CLASS_DECLARATION(orte_job_t);
>
> --
> 1.8.3.2
>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/