Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] environment variables and MPI_Comm_spawn
From: tom fogal (tfogal_at_[hidden])
Date: 2013-12-19 15:37:56


Okay, no worries on the delay, and thanks! -tom

On 12/19/2013 04:32 PM, Ralph Castain wrote:
> Sorry for the delay - buried in my "day job". Adding values to the env array is fine, but this isn't how we would normally do it. I've got it noted on my "to-do" list and will try to get to it in time for 1.7.5.
>
> Thanks
> Ralph
>
> On Dec 13, 2013, at 4:42 PM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:
>
>> Thanks for the first 2 patches, Tom -- I applied them to the SVN trunk and scheduled them to go into the v1.7 series. I don't know if they'll make 1.7.4 or be pushed to 1.7.5, but they'll get there.
>>
>> I'll defer to Ralph for the rest of the discussion about info keys.
>>
>>
>> On Dec 13, 2013, at 9:16 AM, tom fogal <tfogal_at_[hidden]> wrote:
>>
>>> Hi Ralph, thanks for your help!
>>>
>>> Ralph Castain writes:
>>>> It would have to be done via MPI_Info arguments, and we never had a
>>>> request to do so (and hence, don't define such an argument). It would
>>>> be easy enough to do so (look in the ompi/mca/dpm/orte/dpm_orte.c
>>>> code).
>>>
>>> Well, I wanted to just report success, but I've only got the easy
>>> side of it: saving the values from the MPI_Info arguments into
>>> the orte_job_t struct. See attached "0003" patch (against trunk).
>>> However, I couldn't figure out how to get the other side: reading out
>>> the environment variables and setting them at fork. Maybe you could
>>> help with (or do :-) that?
>>>
>>> Or just guide me as to where again: I threw abort()s in 'spawn'
>>> functions I found under plm/, but my programs didn't abort and so I'm
>>> not sure where they went.
>>>
>>>> MPI implementations generally don't forcibly propagate envars because
>>>> it is so hard to know which ones to handle - it is easy to propagate
>>>> a system envar that causes bad things to happen on the remote end.
>>>
>>> I understand. Though in this case, I'm /trying/ to make Bad Things
>>> (tm) happen ;-).
>>>
>>>> One thing you could do, of course, is add that envar to your default
>>>> shell setup (.bashrc or whatever). This would set the variable by
>>>> default on your remote locations (assuming you are using rsh/ssh
>>>> for your launcher), and then any process you start would get
>>>> it. However, that won't help if this is an envar intended only for
>>>> the comm_spawned process.
>>>
>>> Unfortunately what I want to play with at the moment are LD_*
>>> variables, and fiddling with these in my .bashrc will mess up a lot
>>> more than just the simulation I am presently hacking.
>>>
>>>> I can add this capability to the OMPI trunk, and port it to the 1.7
>>>> release - but we don't go all the way back to the 1.4 series any
>>>> more.
>>>
>>> Yes, having this in a 1.7 release would be great!
>>>
>>>
>>> BTW, I encountered a couple other small things while grepping through
>>> source/waiting for trunk to build, so there are two other small patches
>>> attached. One gets rid of warnings about unused functions in generated
>>> lexing code. I believe the second fixes resource leaks on error paths.
>>> However, it turned out none of my user-level code hit that function at
>>> all, so I haven't been able to test it. Take from it what you will...
>>>
>>> -tom
>>>
>>>> On Wed, Dec 11, 2013 at 2:10 PM, tom fogal <tfogal_at_[hidden]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm developing on Open MPI 1.4.5-ubuntu2 on Ubuntu 13.10 (so, Ubuntu's
>>>>> packaged Open MPI) at the moment.
>>>>>
>>>>> I'd like to pass environment variables to processes started via
>>>>> MPI_Comm_spawn. Unfortunately, the MPI 3.0 standard (at least) does
>>>>> not seem to specify a way to do this; thus I have been searching for
>>>>> implementation-specific ways to accomplish my task.
>>>>>
>>>>> I have tried setting the environment variable using the POSIX setenv(3)
>>>>> call, but it seems that Open MPI comm-spawn'd processes do not inherit
>>>>> environment variables. See the attached 2 C99 programs; one prints
>>>>> out the environment it receives, and one sets the MEANING_OF_LIFE
>>>>> environment variable, spawns the previous 'env printing' program, and
>>>>> exits. I run via:
>>>>>
>>>>> $ env -i HOME=/home/tfogal \
>>>>> PATH=/bin:/usr/bin:/usr/local/bin:/sbin:/usr/sbin \
>>>>> mpirun -x TJFVAR=testing -n 5 ./mpienv ./envpar
>>>>>
>>>>> and expect (well, hope) to find the MEANING_OF_LIFE in 'envpar's
>>>>> output. I do see TJFVAR, but the MEANING_OF_LIFE sadly does not
>>>>> propagate. Perhaps I am asking the wrong question...
>>>>>
>>>>> I found another MPI implementation which allowed passing such
>>>>> information via the MPI_Info argument; however, I could find no
>>>>> documentation of similar functionality in Open MPI.
>>>>>
>>>>> Is there a way to accomplish what I'm looking for? I could even be
>>>>> convinced to hack source, but a starting pointer would be appreciated.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -tom
>>>
>>> From 8285a7625e5ea014b9d4df5dd65a7642fd4bc322 Mon Sep 17 00:00:00 2001
>>> From: Tom Fogal <tfogal_at_[hidden]>
>>> Date: Fri, 13 Dec 2013 12:03:56 +0100
>>> Subject: [PATCH 1/3] btl: Remove warnings about unused lexing functions.
>>>
>>> ---
>>> ompi/mca/btl/openib/btl_openib_lex.l | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/ompi/mca/btl/openib/btl_openib_lex.l b/ompi/mca/btl/openib/btl_openib_lex.l
>>> index 2aa6059..7455b78 100644
>>> --- a/ompi/mca/btl/openib/btl_openib_lex.l
>>> +++ b/ompi/mca/btl/openib/btl_openib_lex.l
>>> @@ -1,3 +1,5 @@
>>> +%option nounput
>>> +%option noinput
>>> %{ /* -*- C -*- */
>>> /*
>>> * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>> --
>>> 1.8.3.2
>>>
>>> From dff9fd5ef69f09de6d0fee2236c39a79e8674f92 Mon Sep 17 00:00:00 2001
>>> From: Tom Fogal <tfogal_at_[hidden]>
>>> Date: Fri, 13 Dec 2013 13:06:41 +0100
>>> Subject: [PATCH 2/3] mca: cleanup buf, ps when errors occur.
>>>
>>> ---
>>> orte/mca/plm/base/plm_base_proxy.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/orte/mca/plm/base/plm_base_proxy.c b/orte/mca/plm/base/plm_base_proxy.c
>>> index 5d2b100..275cb3a 100644
>>> --- a/orte/mca/plm/base/plm_base_proxy.c
>>> +++ b/orte/mca/plm/base/plm_base_proxy.c
>>> @@ -128,14 +128,15 @@ int orte_plm_proxy_spawn(orte_job_t *jdata)
>>> command = ORTE_PLM_LAUNCH_JOB_CMD;
>>> if (ORTE_SUCCESS != (rc = opal_dss.pack(buf, &command, 1, ORTE_PLM_CMD))) {
>>> ORTE_ERROR_LOG(rc);
>>> + OBJ_RELEASE(buf);
>>> goto CLEANUP;
>>> }
>>>
>>> /* pack the jdata object */
>>> if (ORTE_SUCCESS != (rc = opal_dss.pack(buf, &jdata, 1, ORTE_JOB))) {
>>> ORTE_ERROR_LOG(rc);
>>> + OBJ_RELEASE(buf);
>>> goto CLEANUP;
>>> -
>>> }
>>>
>>> /* create the proxy spawn object */
>>> @@ -153,6 +154,7 @@ int orte_plm_proxy_spawn(orte_job_t *jdata)
>>> orte_rml_send_callback, NULL))) {
>>> ORTE_ERROR_LOG(rc);
>>> OBJ_RELEASE(buf);
>>> + OBJ_RELEASE(ps);
>>> goto CLEANUP;
>>> }
>>>
>>> --
>>> 1.8.3.2
>>>
>>> From a90f1fb49df1ff9442476b5e4294353ebb94498b Mon Sep 17 00:00:00 2001
>>> From: Tom Fogal <tfogal_at_[hidden]>
>>> Date: Fri, 13 Dec 2013 15:09:10 +0100
>>> Subject: [PATCH 3/3] info: accept env vars desired in child processes
>>>
>>> This looks for "env" keys in MPI_Info structures, which should
>>> then be used to forward environment variables from parent to child
>>> when spawning jobs. However, note this doesn't (yet) change the
>>> spawn machinery.
>>> ---
>>> ompi/mca/dpm/orte/dpm_orte.c | 12 ++++++++++++
>>> orte/runtime/orte_globals.c | 2 ++
>>> orte/runtime/orte_globals.h | 2 ++
>>> 3 files changed, 16 insertions(+)
>>>
>>> diff --git a/ompi/mca/dpm/orte/dpm_orte.c b/ompi/mca/dpm/orte/dpm_orte.c
>>> index 65099a5..b61d6f2 100644
>>> --- a/ompi/mca/dpm/orte/dpm_orte.c
>>> +++ b/ompi/mca/dpm/orte/dpm_orte.c
>>> @@ -680,6 +680,7 @@ static int spawn(int count, const char *array_of_commands[],
>>> char mapper[OPAL_PATH_MAX];
>>> int npernode;
>>> char slot_list[OPAL_PATH_MAX];
>>> + char envvar[1024]; /* better magic number? */
>>>
>>> orte_job_t *jdata;
>>> orte_app_context_t *app;
>>> @@ -705,6 +706,7 @@ static int spawn(int count, const char *array_of_commands[],
>>> - "path": list of directories where to look for the executable
>>> - "file": filename, where additional information is provided.
>>> - "soft": see page 92 of MPI-2.
>>> + - "env": environment variables desired in the children.
>>> */
>>>
>>> /* setup the job object */
>>> @@ -1358,6 +1360,16 @@ static int spawn(int count, const char *array_of_commands[],
>>> jdata->stdin_target = strtoul(stdin_target, NULL, 10);
>>> }
>>> }
>>> +
>>> + /* did the user want us to forward any environment variables? */
>>> + ompi_info_get (array_of_info[i], "env", sizeof(envvar)-1, envvar,
>>> + &flag);
>>> + if ( flag ) {
>>> + jdata->nenv_vars++;
>>> + jdata->env_vars = realloc(jdata->env_vars,
>>> + jdata->nenv_vars*sizeof(char*));
>>> + jdata->env_vars[jdata->nenv_vars-1] = strdup(envvar);
>>> + }
>>> }
>>>
>>> /* default value: If the user did not tell us where to look for the
>>> diff --git a/orte/runtime/orte_globals.c b/orte/runtime/orte_globals.c
>>> index f3e3029..e4ba975 100644
>>> --- a/orte/runtime/orte_globals.c
>>> +++ b/orte/runtime/orte_globals.c
>>> @@ -742,6 +742,8 @@ static void orte_job_construct(orte_job_t* job)
>>> job->ckpt_snapshot_ref = NULL;
>>> job->ckpt_snapshot_loc = NULL;
>>> #endif
>>> + job->env_vars = NULL;
>>> + job->nenv_vars = 0;
>>> }
>>>
>>> static void orte_job_destruct(orte_job_t* job)
>>> diff --git a/orte/runtime/orte_globals.h b/orte/runtime/orte_globals.h
>>> index f284045..d12296b 100644
>>> --- a/orte/runtime/orte_globals.h
>>> +++ b/orte/runtime/orte_globals.h
>>> @@ -463,6 +463,8 @@ typedef struct {
>>> /* snapshot location */
>>> char *ckpt_snapshot_loc;
>>> #endif
>>> + char** env_vars;
>>> + size_t nenv_vars;
>>> } orte_job_t;
>>> ORTE_DECLSPEC OBJ_CLASS_DECLARATION(orte_job_t);
>>>
>>> --
>>> 1.8.3.2
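
Patch 3/3 above grows env_vars with an unchecked realloc() and strdup(). A minimal sketch of the same append step with the failure paths handled might look like the following. The names env_list_t, env_list_append and env_list_free are hypothetical stand-ins for illustration and are not part of Open MPI or ORTE.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the env_vars/nenv_vars pair that the patch
 * adds to orte_job_t. */
typedef struct {
    char **vars;   /* array of "NAME=value" strings */
    size_t count;  /* number of entries in vars */
} env_list_t;

/* Append a copy of entry to the list.
 * Returns 0 on success, -1 on allocation failure; on failure the list
 * is left exactly as it was, so nothing leaks. */
int env_list_append(env_list_t *list, const char *entry)
{
    assert(list != NULL && entry != NULL);

    char *copy = strdup(entry);
    if (NULL == copy) {
        return -1;
    }

    /* Keep the old pointer until realloc is known to have succeeded. */
    char **grown = realloc(list->vars, (list->count + 1) * sizeof(char *));
    if (NULL == grown) {
        free(copy);
        return -1;
    }

    grown[list->count] = copy;
    list->vars = grown;
    list->count++;
    return 0;
}

/* Release every entry and reset the list to empty. */
void env_list_free(env_list_t *list)
{
    for (size_t i = 0; i < list->count; i++) {
        free(list->vars[i]);
    }
    free(list->vars);
    list->vars = NULL;
    list->count = 0;
}
```

Assigning realloc()'s result to a temporary first means the original array is still reachable, and can still be freed, if the allocation fails.
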
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>
>
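
The half that patch 3/3 leaves open, applying the saved "NAME=value" strings when the child is forked, can be illustrated outside Open MPI with plain POSIX calls. This is only a sketch of the general technique, not ORTE's launch code; spawn_with_env is a hypothetical helper, and the exit codes 126 and 127 are an arbitrary convention chosen for the example.

```c
#define _POSIX_C_SOURCE 200809L  /* for setenv(), strndup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, add each "NAME=value" entry to the child's
 * environment, then exec argv[0] (searched on PATH).
 * Returns the child's exit status, or -1 if it could not be obtained. */
int spawn_with_env(char *const argv[], char *const env_vars[], size_t nenv_vars)
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }

    if (0 == pid) {
        /* Child: changes here never affect the parent's environment. */
        for (size_t i = 0; i < nenv_vars; i++) {
            const char *eq = strchr(env_vars[i], '=');
            if (NULL == eq || eq == env_vars[i]) {
                _exit(126);  /* malformed entry: no '=' or empty name */
            }
            char *name = strndup(env_vars[i], (size_t)(eq - env_vars[i]));
            if (NULL == name || 0 != setenv(name, eq + 1, 1)) {
                _exit(126);
            }
            free(name);
        }
        execvp(argv[0], argv);
        _exit(127);  /* only reached if exec failed */
    }

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the variables are set after fork() and before exec, they reach only the spawned process, which is the behaviour wanted for LD_* variables that should not leak into the rest of the session.
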