
Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r21513
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-06-24 18:50:15


I believe you are using a bad example here, George. If you look closely
at the code, you will see that we treat the orte_launch_agent
separately from everything else - it gets passed through the following
code:

int orte_plm_base_setup_orted_cmd(int *argc, char ***argv)
{
     int i, loc;
     char **tmpv;

     /* set default location */
     loc = -1;
     /* split the command apart in case it is multi-word */
     tmpv = opal_argv_split(orte_launch_agent, ' ');
     for (i = 0; NULL != tmpv && NULL != tmpv[i]; ++i) {
         if (0 == strcmp(tmpv[i], "orted")) {
             loc = i;
         }
         opal_argv_append(argc, argv, tmpv[i]);
     }
     opal_argv_free(tmpv);

     /* return the index of "orted" within the new argv (-1 if not found) */
     return loc;
}

Thus, we automatically deal with orte_launch_agent just in case it is
a multi-word command.
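
For concreteness, here is a minimal stand-alone sketch of that behavior
(it uses strtok() in place of opal_argv_split() so it compiles outside
the tree; the agent string is just an example):

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* stand-in for the orte_launch_agent MCA param */
    char agent[] = "orted --mca routed_base_verbose 1";
    char *argv[16];
    int argc = 0, loc = -1;
    char *tok;

    /* mimic opal_argv_split(orte_launch_agent, ' ') plus the loc scan */
    for (tok = strtok(agent, " "); NULL != tok && argc < 15;
         tok = strtok(NULL, " ")) {
        if (0 == strcmp(tok, "orted")) {
            loc = argc;
        }
        argv[argc++] = tok;
    }
    argv[argc] = NULL;

    printf("loc = %d\n", loc);   /* prints: loc = 0 */
    for (int i = 0; i < argc; ++i) {
        printf("argv[%d] = %s\n", i, argv[i]);
    }
    return 0;
}

Each word of the agent becomes its own argv entry, so "--mca
routed_base_verbose 1" reaches the remote daemon as separate, properly
parsed arguments.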

The only things that go through your code loop are the
non-orte_launch_agent params - and they get messed up.
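
To see what "messed up" means in practice (a hedged sketch of the
failure mode, not code from the tree): rsh hands the command line to a
remote shell, which strips any quotes we add, but launchers like SLURM
and Torque exec the argv directly, so the daemon side can be asked to
exec a binary whose name literally contains the quotes - exactly the
hang Jeff reports below:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* the quotes survive into argv[0] when no shell strips them */
    char *args[] = { "\"/home/jsquyres/bogus/bin/orted -s\"", NULL };
    execvp(args[0], args);
    perror("execvp");   /* prints: execvp: No such file or directory */
    return 1;
}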

We added this code specifically because Tim P and I wanted to pass
multi-word orted cmds. It was (and is) the only case where this is of
any use today.

Ralph

On Jun 24, 2009, at 4:05 PM, George Bosilca wrote:

> Just for the sake of it, a funny command line to try:
>
> [bosilca_at_dancer ~]$ mpirun --mca routed_base_verbose 0 --leave-session-attached -np 1 --mca orte_launch_agent "orted --mca routed_base_verbose 1" uptime
>
> [node03:22355] [[14661,0],1] routed_linear: init routes for daemon job [14661,0]
>   hnp_uri 960823296.0;tcp://192.168.1.254:58135;tcp://192.168.0.2:58135
> 18:02:59 up 26 days, 17:41, 0 users, load average: 0.97, 0.50, 0.53
> [bosilca_at_dancer ~]$ [node03:22355] [[14661,0],1] routed_linear_get([[14661,0],0]) --> [[14661,0],0]
> [node03:22355] [[14661,0],1] routed_linear: init routes for daemon job [14661,0]
>   hnp_uri 960823296.0;tcp://192.168.1.254:58135;tcp://192.168.0.2:58135
> [node03:22355] [[14661,0],1] routed_linear_get([[14661,0],0]) --> [[14661,0],0]
> [node03:22355] [[14661,0],1] routed_linear_get([[14661,0],0]) --> [[14661,0],0]
> [node03:22355] [[14661,0],1] routed_linear_get([[14661,0],0]) --> [[14661,0],0]
>
> This sets routed_base_verbose to zero for the HNP, and to 1 for
> everybody else. As you can see from the output, the orted printed
> routing information, which means it correctly interpreted the
> multi-word argument.
>
> george.
>
> On Jun 24, 2009, at 17:52 , George Bosilca wrote:
>
>>
>> On Jun 24, 2009, at 17:41 , Jeff Squyres wrote:
>>
>>> -----
>>> [14:38] svbu-mpi:~/svn/ompi/orte % mpirun --mca plm_base_verbose 100 --leave-session-attached -np 1 --mca orte_launch_agent "$bogus/bin/orted -s" uptime
>>> ...lots of output...
>>> srun --nodes=1 --ntasks=1 --kill-on-bad-exit --nodelist=svbu-mpi062 /home/jsquyres/bogus/bin/orted -s -mca ess slurm -mca orte_ess_jobid 3195142144 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri "3195142144.0;tcp://172.29.218.140:34489;tcp://10.10.20.250:34489;tcp://10.10.30.250:34489;tcp://192.168.183.1:34489;tcp://192.168.184.1:34489" -mca orte_nodelist svbu-mpi062 --mca plm_base_verbose 100 --mca orte_launch_agent "/home/jsquyres/bogus/bin/orted -s"
>>> ...
>>> -----
>>>
>>> and it hangs, because the argv[0]
>>>
>>> "/home/jsquyres/bogus/bin/orted -s"
>>>
>>> (including the quotes!) cannot be exec'ed.
>>
>> OK, so maybe the -s option was a bad example (it's the one I use
>> regularly). It blocks the orteds; you will have to log on to each
>> node, attach gdb to the orted, and release them by doing a "set
>> orted_spin_flag=0".
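>>
>> For example, something along these lines on each node (a sketch; the
>> pgrep-based attach is illustrative, not a prescribed procedure):
>>
>>    $ gdb -p `pgrep orted`
>>    (gdb) set orted_spin_flag=0
>>    (gdb) detach
>>    (gdb) quit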
>>
>> george.
>>
>>>
>>>
>>>
>>>
>>> On Jun 24, 2009, at 5:15 PM, George Bosilca wrote:
>>>
>>>> I can't guarantee this for all PLMs, but I can confirm that rsh and
>>>> slurm (1.3.12) work well with this.
>>>>
>>>> We tried it with and without Open MPI, and the outcome is the same.
>>>>
>>>> [bosilca_at_dancer c]$ srun -n 4 echo "1 2 3 4 5 it works"
>>>> 1 2 3 4 5 it works
>>>> 1 2 3 4 5 it works
>>>> 1 2 3 4 5 it works
>>>> 1 2 3 4 5 it works
>>>>
>>>> [bosilca_at_dancer c]$ srun -N 2 -c 2 mpirun --mca plm slurm --mca orte_launch_agent "orted -s" --mca plm_rsh_tree_spawn 1 --bynode --mca pml ob1 --mca orte_daemon_spin 0 ./hello
>>>> Hello, world, I am 0 of 2 on node03
>>>> Hello, world, I am 1 of 2 on node04
>>>>
>>>> *after releasing the orteds from their spin.
>>>>
>>>> In fact, what I find strange is the old behavior. Dropping
>>>> arguments without even letting the user know about it is certainly
>>>> not a desirable approach.
>>>>
>>>> george.
>>>>
>>>> On Jun 24, 2009, at 16:15 , Ralph Castain wrote:
>>>>
>>>> > Yo George
>>>> >
>>>> > This commit is going to break non-rsh launchers. While it is true
>>>> > that the rsh launcher may handle multi-word options by putting them
>>>> > in quotes, we specifically avoided it here because it breaks SLURM,
>>>> > Torque, and others.
>>>> >
>>>> > This is why we specifically put the inclusion of multi-word options
>>>> > in the rsh plm module, and not here. Would you please move it back
>>>> > there?
>>>> >
>>>> > Thanks
>>>> > Ralph
>>>> >
>>>> >
>>>> > On Wed, Jun 24, 2009 at 1:51 PM, <bosilca_at_[hidden]> wrote:
>>>> > Author: bosilca
>>>> > Date: 2009-06-24 15:51:52 EDT (Wed, 24 Jun 2009)
>>>> > New Revision: 21513
>>>> > URL: https://svn.open-mpi.org/trac/ompi/changeset/21513
>>>> >
>>>> > Log:
>>>> > When we get a report from an orted about its state, don't use the
>>>> > sender of the message to update the structures, but instead use the
>>>> > information from the URI. The reason is that even the launch report
>>>> > messages can get routed.
>>>> >
>>>> > Deal with the orted_cmd_line in a single location.
>>>> >
>>>> > Text files modified:
>>>> >   trunk/orte/mca/plm/base/plm_base_launch_support.c | 69 +++++++++++++++++++++++----------------
>>>> >   1 files changed, 41 insertions(+), 28 deletions(-)
>>>> >
>>>> > Modified: trunk/orte/mca/plm/base/plm_base_launch_support.c
>>>> > ==============================================================================
>>>> > --- trunk/orte/mca/plm/base/plm_base_launch_support.c  (original)
>>>> > +++ trunk/orte/mca/plm/base/plm_base_launch_support.c  2009-06-24 15:51:52 EDT (Wed, 24 Jun 2009)
>>>> > @@ -433,7 +433,8 @@
>>>> >  {
>>>> >      orte_message_event_t *mev = (orte_message_event_t*)data;
>>>> >      opal_buffer_t *buffer = mev->buffer;
>>>> > -    char *rml_uri;
>>>> > +    orte_process_name_t peer;
>>>> > +    char *rml_uri = NULL;
>>>> >      int rc, idx;
>>>> >      int32_t arch;
>>>> >      orte_node_t **nodes;
>>>> > @@ -442,19 +443,11 @@
>>>> >      int64_t setupsec, setupusec;
>>>> >      int64_t startsec, startusec;
>>>> >
>>>> > -    OPAL_OUTPUT_VERBOSE((5, orte_plm_globals.output,
>>>> > -                         "%s plm:base:orted_report_launch from daemon %s",
>>>> > -                         ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
>>>> > -                         ORTE_NAME_PRINT(&mev->sender)));
>>>> > -
>>>> >      /* see if we need to timestamp this receipt */
>>>> >      if (orte_timing) {
>>>> >          gettimeofday(&recvtime, NULL);
>>>> >      }
>>>> >
>>>> > -    /* update state */
>>>> > -    pdatorted[mev->sender.vpid]->state = ORTE_PROC_STATE_RUNNING;
>>>> > -
>>>> >      /* unpack its contact info */
>>>> >      idx = 1;
>>>> >      if (ORTE_SUCCESS != (rc = opal_dss.unpack(buffer, &rml_uri, &idx, OPAL_STRING))) {
>>>> > @@ -466,13 +459,26 @@
>>>> >      /* set the contact info into the hash table */
>>>> >      if (ORTE_SUCCESS != (rc = orte_rml.set_contact_info(rml_uri))) {
>>>> >          ORTE_ERROR_LOG(rc);
>>>> > -        free(rml_uri);
>>>> >          orted_failed_launch = true;
>>>> >          goto CLEANUP;
>>>> >      }
>>>> > -    /* lookup and record this daemon's contact info */
>>>> > -    pdatorted[mev->sender.vpid]->rml_uri = strdup(rml_uri);
>>>> > -    free(rml_uri);
>>>> > +
>>>> > +    rc = orte_rml_base_parse_uris(rml_uri, &peer, NULL );
>>>> > +    if( ORTE_SUCCESS != rc ) {
>>>> > +        ORTE_ERROR_LOG(rc);
>>>> > +        orted_failed_launch = true;
>>>> > +        goto CLEANUP;
>>>> > +    }
>>>> > +
>>>> > +    OPAL_OUTPUT_VERBOSE((5, orte_plm_globals.output,
>>>> > +                         "%s plm:base:orted_report_launch from daemon %s via %s",
>>>> > +                         ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
>>>> > +                         ORTE_NAME_PRINT(&peer),
>>>> > +                         ORTE_NAME_PRINT(&mev->sender)));
>>>> > +
>>>> > +    /* update state and record for this daemon contact info */
>>>> > +    pdatorted[peer.vpid]->state = ORTE_PROC_STATE_RUNNING;
>>>> > +    pdatorted[peer.vpid]->rml_uri = rml_uri;
>>>> >
>>>> >      /* get the remote arch */
>>>> >      idx = 1;
>>>> > @@ -555,31 +561,33 @@
>>>> >
>>>> >      /* lookup the node */
>>>> >      nodes = (orte_node_t**)orte_node_pool->addr;
>>>> > -    if (NULL == nodes[mev->sender.vpid]) {
>>>> > +    if (NULL == nodes[peer.vpid]) {
>>>> >          ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
>>>> >          orted_failed_launch = true;
>>>> >          goto CLEANUP;
>>>> >      }
>>>> >      /* store the arch */
>>>> > -    nodes[mev->sender.vpid]->arch = arch;
>>>> > +    nodes[peer.vpid]->arch = arch;
>>>> >
>>>> >      /* if a tree-launch is underway, send the cmd back */
>>>> >      if (NULL != orte_tree_launch_cmd) {
>>>> > -        orte_rml.send_buffer(&mev->sender, orte_tree_launch_cmd, ORTE_RML_TAG_DAEMON, 0);
>>>> > +        orte_rml.send_buffer(&peer, orte_tree_launch_cmd, ORTE_RML_TAG_DAEMON, 0);
>>>> >      }
>>>> >
>>>> >  CLEANUP:
>>>> >
>>>> >      OPAL_OUTPUT_VERBOSE((5, orte_plm_globals.output,
>>>> > -                         "%s plm:base:orted_report_launch %s for daemon %s at contact %s",
>>>> > +                         "%s plm:base:orted_report_launch %s for daemon %s (via %s) at contact %s",
>>>> >                           ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
>>>> >                           orted_failed_launch ? "failed" : "completed",
>>>> > -                         ORTE_NAME_PRINT(&mev->sender), pdatorted[mev->sender.vpid]->rml_uri));
>>>> > +                         ORTE_NAME_PRINT(&peer),
>>>> > +                         ORTE_NAME_PRINT(&mev->sender), pdatorted[peer.vpid]->rml_uri));
>>>> >
>>>> >      /* release the message */
>>>> >      OBJ_RELEASE(mev);
>>>> >
>>>> >      if (orted_failed_launch) {
>>>> > +        if( NULL != rml_uri ) free(rml_uri);
>>>> >          orte_errmgr.incomplete_start(ORTE_PROC_MY_NAME->jobid, ORTE_ERROR_DEFAULT_EXIT_CODE);
>>>> >      } else {
>>>> >          orted_num_callback++;
>>>> > @@ -1133,18 +1141,23 @@
>>>> >       * being sure to "purge" any that would cause problems
>>>> >       * on backend nodes
>>>> >       */
>>>> > -    if (ORTE_PROC_IS_HNP) {
>>>> > +    if (ORTE_PROC_IS_HNP || ORTE_PROC_IS_DAEMON) {
>>>> >          cnt = opal_argv_count(orted_cmd_line);
>>>> >          for (i=0; i < cnt; i+=3) {
>>>> > -            /* if the specified option is more than one word, we don't
>>>> > -             * have a generic way of passing it as some environments ignore
>>>> > -             * any quotes we add, while others don't - so we ignore any
>>>> > -             * such options. In most cases, this won't be a problem as
>>>> > -             * they typically only apply to things of interest to the HNP.
>>>> > -             * Individual environments can add these back into the cmd line
>>>> > -             * as they know if it can be supported
>>>> > -             */
>>>> > -            if (NULL != strchr(orted_cmd_line[i+2], ' ')) {
>>>> > +            /* in the rsh environment, we can append multi-word arguments
>>>> > +             * by enclosing them in quotes. Check for any multi-word
>>>> > +             * mca params passed to mpirun and include them
>>>> > +             */
>>>> > +            if (NULL != strchr(orted_cmd_line[i+2], ' ')) {
>>>> > +                char* param;
>>>> > +
>>>> > +                /* must add quotes around it */
>>>> > +                asprintf(&param, "\"%s\"", orted_cmd_line[i+2]);
>>>> > +                /* now pass it along */
>>>> > +                opal_argv_append(argc, argv, orted_cmd_line[i]);
>>>> > +                opal_argv_append(argc, argv, orted_cmd_line[i+1]);
>>>> > +                opal_argv_append(argc, argv, param);
>>>> > +                free(param);
>>>> >                  continue;
>>>> >              }
>>>> >              /* The daemon will attempt to open the PLM on the remote
>>>
>>>
>>> --
>>> Jeff Squyres
>>> Cisco Systems
>>>
>>
>