Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] environment variables and MPI_Comm_spawn
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-12-19 16:00:44


It's in the trunk now and CMR'd for 1.7.4. I copied you on the ticket.

Thanks!
Ralph

On Dec 19, 2013, at 12:37 PM, tom fogal <tfogal_at_[hidden]> wrote:

> Okay, no worries on the delay, and thanks! -tom
>
> On 12/19/2013 04:32 PM, Ralph Castain wrote:
>> Sorry for the delay - buried in my "day job". Adding values to the env array is fine, but this isn't how we would normally do it. I've got it noted on my "to-do" list and will try to get to it in time for 1.7.5.
>>
>> Thanks
>> Ralph
>>
>> On Dec 13, 2013, at 4:42 PM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:
>>
>>> Thanks for the first 2 patches, Tom -- I applied them to the SVN trunk and scheduled them to go into the v1.7 series. I don't know if they'll make 1.7.4 or be pushed to 1.7.5, but they'll get there.
>>>
>>> I'll defer to Ralph for the rest of the discussion about info keys.
>>>
>>>
>>> On Dec 13, 2013, at 9:16 AM, tom fogal <tfogal_at_[hidden]> wrote:
>>>
>>>> Hi Ralph, thanks for your help!
>>>>
>>>> Ralph Castain writes:
>>>>> It would have to be done via MPI_Info arguments, and we never had a
>>>>> request to do so (and hence, don't define such an argument). It would
>>>>> be easy enough to do so (look in the ompi/mca/dpm/orte/dpm_orte.c
>>>>> code).
>>>>
>>>> Well, I wanted to just report success, but I've only got the easy
>>>> side of it: saving the values from the MPI_Info arguments into
>>>> the orte_job_t struct. See attached "0003" patch (against trunk).
>>>> However, I couldn't figure out how to do the other side: reading out
>>>> the environment variables and setting them at fork. Maybe you could
>>>> help with (or do :-) that?
>>>>
>>>> Or just point me to where that happens: I threw abort()s in the 'spawn'
>>>> functions I found under plm/, but my programs didn't abort, so I'm
>>>> not sure which code path the launch actually takes.
>>>>
>>>>> MPI implementations generally don't forcibly propagate envars because
>>>>> it is so hard to know which ones to handle - it is easy to propagate
>>>>> a system envar that causes bad things to happen on the remote end.
>>>>
>>>> I understand. Though in this case, I'm /trying/ to make Bad Things
>>>> (tm) happen ;-).
>>>>
>>>>> One thing you could do, of course, is add that envar to your default
>>>>> shell setup (.bashrc or whatever). This would set the variable by
>>>>> default on your remote locations (assuming you are using rsh/ssh
>>>>> for your launcher), and then any process you start would get
>>>>> it. However, that won't help if this is an envar intended only for
>>>>> the comm_spawned process.
>>>>
>>>> Unfortunately what I want to play with at the moment are LD_*
>>>> variables, and fiddling with these in my .bashrc will mess up a lot
>>>> more than just the simulation I am presently hacking.
>>>>
>>>>> I can add this capability to the OMPI trunk, and port it to the 1.7
>>>>> release - but we don't go all the way back to the 1.4 series any
>>>>> more.
>>>>
>>>> Yes, having this in a 1.7 release would be great!
>>>>
>>>>
>>>> BTW, I encountered a couple other small things while grepping through
>>>> source/waiting for trunk to build, so there are two other small patches
>>>> attached. One gets rid of warnings about unused functions in generated
>>>> lexing code. I believe the second fixes resource leaks on error paths.
>>>> However, it turned out none of my user-level code hit that function at
>>>> all, so I haven't been able to test it. Take from it what you will...
>>>>
>>>> -tom
>>>>
>>>>> On Wed, Dec 11, 2013 at 2:10 PM, tom fogal <tfogal_at_[hidden]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I'm developing on Open MPI 1.4.5-ubuntu2 on Ubuntu 13.10 (so, Ubuntu's
>>>>>> packaged Open MPI) at the moment.
>>>>>>
>>>>>> I'd like to pass environment variables to processes started via
>>>>>> MPI_Comm_spawn. Unfortunately, the MPI 3.0 standard (at least) does
>>>>>> not seem to specify a way to do this; thus I have been searching for
>>>>>> implementation-specific ways to accomplish my task.
>>>>>>
>>>>>> I have tried setting the environment variable using the POSIX setenv(3)
>>>>>> call, but it seems that Open MPI comm-spawn'd processes do not inherit
>>>>>> environment variables. See the attached 2 C99 programs; one prints
>>>>>> out the environment it receives, and one sets the MEANING_OF_LIFE
>>>>>> environment variable, spawns the previous 'env printing' program, and
>>>>>> exits. I run via:
>>>>>>
>>>>>> $ env -i HOME=/home/tfogal \
>>>>>> PATH=/bin:/usr/bin:/usr/local/bin:/sbin:/usr/sbin \
>>>>>> mpirun -x TJFVAR=testing -n 5 ./mpienv ./envpar
>>>>>>
>>>>>> and expect (well, hope) to find the MEANING_OF_LIFE in 'envpar's
>>>>>> output. I do see TJFVAR, but the MEANING_OF_LIFE sadly does not
>>>>>> propagate. Perhaps I am asking the wrong question...
>>>>>>
>>>>>> I found another MPI implementation which allows passing such
>>>>>> information via the MPI_Info argument; however, I could find no
>>>>>> documentation of similar functionality in Open MPI.
>>>>>>
>>>>>> Is there a way to accomplish what I'm looking for? I could even be
>>>>>> convinced to hack source, but a starting pointer would be appreciated.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -tom
>>>>
>>>> From 8285a7625e5ea014b9d4df5dd65a7642fd4bc322 Mon Sep 17 00:00:00 2001
>>>> From: Tom Fogal <tfogal_at_[hidden]>
>>>> Date: Fri, 13 Dec 2013 12:03:56 +0100
>>>> Subject: [PATCH 1/3] btl: Remove warnings about unused lexing functions.
>>>>
>>>> ---
>>>> ompi/mca/btl/openib/btl_openib_lex.l | 2 ++
>>>> 1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/ompi/mca/btl/openib/btl_openib_lex.l b/ompi/mca/btl/openib/btl_openib_lex.l
>>>> index 2aa6059..7455b78 100644
>>>> --- a/ompi/mca/btl/openib/btl_openib_lex.l
>>>> +++ b/ompi/mca/btl/openib/btl_openib_lex.l
>>>> @@ -1,3 +1,5 @@
>>>> +%option nounput
>>>> +%option noinput
>>>> %{ /* -*- C -*- */
>>>> /*
>>>> * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>>> --
>>>> 1.8.3.2
>>>>
>>>> From dff9fd5ef69f09de6d0fee2236c39a79e8674f92 Mon Sep 17 00:00:00 2001
>>>> From: Tom Fogal <tfogal_at_[hidden]>
>>>> Date: Fri, 13 Dec 2013 13:06:41 +0100
>>>> Subject: [PATCH 2/3] mca: cleanup buf, ps when errors occur.
>>>>
>>>> ---
>>>> orte/mca/plm/base/plm_base_proxy.c | 4 +++-
>>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/orte/mca/plm/base/plm_base_proxy.c b/orte/mca/plm/base/plm_base_proxy.c
>>>> index 5d2b100..275cb3a 100644
>>>> --- a/orte/mca/plm/base/plm_base_proxy.c
>>>> +++ b/orte/mca/plm/base/plm_base_proxy.c
>>>> @@ -128,14 +128,15 @@ int orte_plm_proxy_spawn(orte_job_t *jdata)
>>>> command = ORTE_PLM_LAUNCH_JOB_CMD;
>>>> if (ORTE_SUCCESS != (rc = opal_dss.pack(buf, &command, 1, ORTE_PLM_CMD))) {
>>>> ORTE_ERROR_LOG(rc);
>>>> + OBJ_RELEASE(buf);
>>>> goto CLEANUP;
>>>> }
>>>>
>>>> /* pack the jdata object */
>>>> if (ORTE_SUCCESS != (rc = opal_dss.pack(buf, &jdata, 1, ORTE_JOB))) {
>>>> ORTE_ERROR_LOG(rc);
>>>> + OBJ_RELEASE(buf);
>>>> goto CLEANUP;
>>>> -
>>>> }
>>>>
>>>> /* create the proxy spawn object */
>>>> @@ -153,6 +154,7 @@ int orte_plm_proxy_spawn(orte_job_t *jdata)
>>>> orte_rml_send_callback, NULL))) {
>>>> ORTE_ERROR_LOG(rc);
>>>> OBJ_RELEASE(buf);
>>>> + OBJ_RELEASE(ps);
>>>> goto CLEANUP;
>>>> }
>>>>
>>>> --
>>>> 1.8.3.2
>>>>
>>>> From a90f1fb49df1ff9442476b5e4294353ebb94498b Mon Sep 17 00:00:00 2001
>>>> From: Tom Fogal <tfogal_at_[hidden]>
>>>> Date: Fri, 13 Dec 2013 15:09:10 +0100
>>>> Subject: [PATCH 3/3] info: accept env vars desired in child processes
>>>>
>>>> This looks for "env" keys in MPI_Info structures, which should be
>>>> then used to forward environment variables from parent to child
>>>> when spawning jobs. However, note this doesn't (yet) change the
>>>> spawn machinery.
>>>> ---
>>>> ompi/mca/dpm/orte/dpm_orte.c | 12 ++++++++++++
>>>> orte/runtime/orte_globals.c | 2 ++
>>>> orte/runtime/orte_globals.h | 2 ++
>>>> 3 files changed, 16 insertions(+)
>>>>
>>>> diff --git a/ompi/mca/dpm/orte/dpm_orte.c b/ompi/mca/dpm/orte/dpm_orte.c
>>>> index 65099a5..b61d6f2 100644
>>>> --- a/ompi/mca/dpm/orte/dpm_orte.c
>>>> +++ b/ompi/mca/dpm/orte/dpm_orte.c
>>>> @@ -680,6 +680,7 @@ static int spawn(int count, const char *array_of_commands[],
>>>> char mapper[OPAL_PATH_MAX];
>>>> int npernode;
>>>> char slot_list[OPAL_PATH_MAX];
>>>> + char envvar[1024]; /* better magic number? */
>>>>
>>>> orte_job_t *jdata;
>>>> orte_app_context_t *app;
>>>> @@ -705,6 +706,7 @@ static int spawn(int count, const char *array_of_commands[],
>>>> - "path": list of directories where to look for the executable
>>>> - "file": filename, where additional information is provided.
>>>> - "soft": see page 92 of MPI-2.
>>>> + - "env": environment variables desired in the children.
>>>> */
>>>>
>>>> /* setup the job object */
>>>> @@ -1358,6 +1360,16 @@ static int spawn(int count, const char *array_of_commands[],
>>>> jdata->stdin_target = strtoul(stdin_target, NULL, 10);
>>>> }
>>>> }
>>>> +
>>>> + /* did the user want us to forward any environment variables? */
>>>> + ompi_info_get (array_of_info[i], "env", sizeof(envvar)-1, envvar,
>>>> + &flag);
>>>> + if ( flag ) {
>>>> + jdata->nenv_vars++;
>>>> + jdata->env_vars = realloc(jdata->env_vars,
>>>> + jdata->nenv_vars*sizeof(char*));
>>>> + jdata->env_vars[jdata->nenv_vars-1] = strdup(envvar);
>>>> + }
>>>> }
>>>>
>>>> /* default value: If the user did not tell us where to look for the
>>>> diff --git a/orte/runtime/orte_globals.c b/orte/runtime/orte_globals.c
>>>> index f3e3029..e4ba975 100644
>>>> --- a/orte/runtime/orte_globals.c
>>>> +++ b/orte/runtime/orte_globals.c
>>>> @@ -742,6 +742,8 @@ static void orte_job_construct(orte_job_t* job)
>>>> job->ckpt_snapshot_ref = NULL;
>>>> job->ckpt_snapshot_loc = NULL;
>>>> #endif
>>>> + job->env_vars = NULL;
>>>> + job->nenv_vars = 0;
>>>> }
>>>>
>>>> static void orte_job_destruct(orte_job_t* job)
>>>> diff --git a/orte/runtime/orte_globals.h b/orte/runtime/orte_globals.h
>>>> index f284045..d12296b 100644
>>>> --- a/orte/runtime/orte_globals.h
>>>> +++ b/orte/runtime/orte_globals.h
>>>> @@ -463,6 +463,8 @@ typedef struct {
>>>> /* snapshot location */
>>>> char *ckpt_snapshot_loc;
>>>> #endif
>>>> + char** env_vars;
>>>> + size_t nenv_vars;
>>>> } orte_job_t;
>>>> ORTE_DECLSPEC OBJ_CLASS_DECLARATION(orte_job_t);
>>>>
>>>> --
>>>> 1.8.3.2
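
The third patch above grows the env_vars array with an unchecked realloc() and strdup(), and the posted diff only initializes the new fields in orte_job_construct. As a rough illustration of the same append with allocation failures handled and a matching cleanup, here is a minimal standalone sketch. It is not Open MPI code: env_list_t, env_list_append and env_list_free are hypothetical names that only mirror the two fields the patch adds to orte_job_t.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() under strict -std= modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two fields the patch adds to orte_job_t. */
typedef struct {
    char **env_vars;
    size_t nenv_vars;
} env_list_t;

/* Append a private copy of `entry` to the list.
 * Returns 0 on success. On allocation failure returns -1 and leaves
 * the list exactly as it was (no leak, no dangling pointer). */
static int env_list_append(env_list_t *list, const char *entry)
{
    assert(list != NULL && entry != NULL);

    char *copy = strdup(entry);
    if (copy == NULL) {
        return -1;
    }

    /* Keep the old pointer until realloc() is known to have succeeded. */
    char **grown = realloc(list->env_vars,
                           (list->nenv_vars + 1) * sizeof(*grown));
    if (grown == NULL) {
        free(copy);
        return -1;
    }

    grown[list->nenv_vars] = copy;
    list->env_vars = grown;
    list->nenv_vars++;
    return 0;
}

/* Release every stored string and the array itself, then reset the list. */
static void env_list_free(env_list_t *list)
{
    assert(list != NULL);

    for (size_t i = 0; i < list->nenv_vars; i++) {
        free(list->env_vars[i]);
    }
    free(list->env_vars);
    list->env_vars = NULL;
    list->nenv_vars = 0;
}
```

Assigning realloc()'s result to a temporary first is the point of the sketch: if the call fails, the original array is still reachable and can be freed later. A job destructor would presumably call something like env_list_free, though that is an assumption about how the fields would be owned.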
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>> --
>>> Jeff Squyres
>>> jsquyres_at_[hidden]
>>> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
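
The "other side" discussed in this thread, reading the stored variables and setting them at fork, can be illustrated outside of Open MPI with plain POSIX calls. The sketch below is not ORTE's launch code. apply_env_entry and run_with_env are hypothetical helpers, and the "NAME=VALUE" entry format is an assumption about what the "env" info value would hold. The idea is only that the variables are set in the child, after fork() and before exec, so the parent's own environment is never modified.

```c
#define _POSIX_C_SOURCE 200809L  /* for setenv() under strict -std= modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: export one "NAME=VALUE" entry in the calling
 * process. Returns 0 on success, -1 if the entry has no '=' or an
 * empty name, or if allocation / setenv() fails. */
static int apply_env_entry(const char *entry)
{
    const char *eq = strchr(entry, '=');
    if (eq == NULL || eq == entry) {
        return -1;
    }

    size_t len = (size_t)(eq - entry);
    char *name = malloc(len + 1);
    if (name == NULL) {
        return -1;
    }
    memcpy(name, entry, len);
    name[len] = '\0';

    int rc = setenv(name, eq + 1, 1);  /* 1 = overwrite an existing value */
    free(name);
    return rc;
}

/* Hypothetical helper: fork, apply the entries in the child only, then
 * exec argv[0]. Returns the child's exit status, or -1 if fork/wait
 * fails or the child did not exit normally. The child exits with 126
 * for a bad entry and 127 if the exec itself fails. */
static int run_with_env(char *const argv[],
                        char *const env_vars[], size_t nenv_vars)
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }

    if (pid == 0) {
        /* Child: changes made here are invisible to the parent. */
        for (size_t i = 0; i < nenv_vars; i++) {
            if (apply_env_entry(env_vars[i]) != 0) {
                _exit(126);
            }
        }
        execvp(argv[0], argv);
        _exit(127);  /* only reached if execvp() failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling run_with_env with the argument vector {"sh", "-c", "echo $MEANING_OF_LIFE", NULL} and the single entry "MEANING_OF_LIFE=42" would run the shell with that variable set, while getenv("MEANING_OF_LIFE") in the caller is left untouched. Setting variables only in the child also fits the LD_* use case mentioned in the thread, where changing the parent's or the login shell's environment is undesirable.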