To be specific, here is how the rsh launcher handles orte_launch_agent so that it works:

    /* now get the orted cmd - as specified by user - into our tmp array.
     * The function returns the location where the actual orted command is
     * located - usually in the final spot, but someone could
     * have added options. For example, it should be legal for them to use
     * "orted --debug-devel" so they get debug output from the orteds, but
     * not from mpirun. Also, they may have a customized version of orted
     * that takes arguments in addition to the std ones we already support
     */
    orted_argc = 0;
    orted_argv = NULL;
    orted_index = orte_plm_base_setup_orted_cmd(&orted_argc, &orted_argv);

    

    /* look at the returned orted cmd argv to check several cases:
     *
     * - only "orted" was given. This is the default and thus most common
     *   case. In this situation, there is nothing we need to do
     *
     * - something was given that doesn't include "orted" - i.e., someone
     *   has substituted their own daemon. There isn't anything we can
     *   do here, so we want to avoid adding prefixes to the cmd
     *
     * - something was given that precedes "orted". For example, someone
     *   may have specified "valgrind [options] orted". In this case, we
     *   need to separate out that "orted_prefix" section so it can be
     *   treated separately below
     *
     * - something was given that follows "orted". An example was given above.
     *   In this case, we need to construct the effective "orted_cmd" so it
     *   can be treated properly below
     *
     * Obviously, the latter two cases can be combined - just to make it
     * even more interesting! Gotta love rsh/ssh...
     */
    if (0 == orted_index) {
        /* this is the default scenario, but there could be options specified
         * so we need to account for that possibility
         */
        orted_cmd = opal_argv_join(orted_argv, ' ');
        orted_prefix = NULL;
    } else if (0 > orted_index) {
        /* no "orted" was included */
        orted_cmd = NULL;
        orted_prefix = opal_argv_join(orted_argv, ' ');
    } else {
        /* okay, so the "orted" cmd is somewhere in this array, with
         * something preceding it and perhaps things following it.
         */
        orted_prefix = opal_argv_join_range(orted_argv, 0, orted_index, ' ');
        orted_cmd = opal_argv_join_range(orted_argv, orted_index, opal_argv_count(orted_argv), ' ');
    }
    opal_argv_free(orted_argv);  /* done with this */

    

    /* we now need to assemble the actual cmd that will be executed - this depends
     * upon whether or not a prefix directory is being used
     */


As noted in a prior email, the setup function is:

int orte_plm_base_setup_orted_cmd(int *argc, char ***argv)
{
    int i, loc;
    char **tmpv;

    

    /* set default location */
    loc = -1;
    /* split the command apart in case it is multi-word */
    tmpv = opal_argv_split(orte_launch_agent, ' ');
    for (i = 0; NULL != tmpv && NULL != tmpv[i]; ++i) {
        if (0 == strcmp(tmpv[i], "orted")) {
            loc = i;
        }
        opal_argv_append(argc, argv, tmpv[i]);
    }
    opal_argv_free(tmpv);

    

    return loc;
}

So as you can see, we deliberately split the cmd apart and reassemble it to allow for any variation of the orted cmd you might like to use. We did this because there is no generic way to pass a quoted multi-word cmd in all environments - every variant we tried failed in at least one environment, with either not enough quotes or too many.
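For anyone reading along without the tree handy, the split/locate/rejoin steps above can be sketched outside of ORTE as follows. This is only an illustration in plain C: split_agent, join_range and agent_parts_t are hypothetical names, strtok stands in for opal_argv_split, and the word limit is arbitrary.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 64   /* arbitrary limit for this sketch */

/* Hypothetical stand-in for the rsh launcher's bookkeeping; not ORTE code. */
typedef struct {
    int   index;    /* position of "orted" in the word list, -1 if absent */
    char *prefix;   /* words preceding "orted", or the whole cmd if absent */
    char *cmd;      /* "orted" plus anything following it, NULL if absent */
} agent_parts_t;

/* Join words[from..to) with single spaces. Returns a malloc'd string,
 * or NULL when the range is empty (or on allocation failure). */
char *join_range(char **words, int from, int to)
{
    size_t len = 0;
    char *out;
    int i;

    if (from >= to) return NULL;
    for (i = from; i < to; ++i) len += strlen(words[i]) + 1;
    out = malloc(len);
    if (NULL == out) return NULL;
    out[0] = '\0';
    for (i = from; i < to; ++i) {
        if (i > from) strcat(out, " ");
        strcat(out, words[i]);
    }
    return out;
}

/* Split the agent string on spaces, remember where "orted" sits, and
 * rebuild the prefix and cmd strings the same way the three branches
 * in the rsh code do. */
agent_parts_t split_agent(const char *agent)
{
    agent_parts_t parts = { -1, NULL, NULL };
    char *words[MAX_WORDS];
    char *copy, *tok;
    int n = 0;

    assert(NULL != agent);
    copy = malloc(strlen(agent) + 1);
    if (NULL == copy) return parts;
    strcpy(copy, agent);

    for (tok = strtok(copy, " "); NULL != tok && n < MAX_WORDS;
         tok = strtok(NULL, " ")) {
        if (0 == strcmp(tok, "orted")) parts.index = n;
        words[n++] = tok;
    }

    if (parts.index < 0) {
        /* no "orted" given: everything is treated as the prefix */
        parts.prefix = join_range(words, 0, n);
    } else {
        /* prefix is NULL when "orted" is the first word */
        parts.prefix = join_range(words, 0, parts.index);
        parts.cmd = join_range(words, parts.index, n);
    }
    free(copy);   /* the joined strings are independent copies */
    return parts;
}
```

Calling split_agent("valgrind --leak-check=full orted -s") yields index 2, prefix "valgrind --leak-check=full" and cmd "orted -s"; the caller frees prefix and cmd. Note that, like the strcmp in the real function, this only recognizes the bare word "orted", not a full path ending in it.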

We didn't do this just for the heck of it. Several of us spent a bunch of time testing all environments, trying to find a way to support this capability. After a lot of pain, we finally developed this method that has been working for well over a year.

I really would rather not waste a lot of my time going through this rather lengthy demonstration/argument cycle again. For the purposes of your tree spawn, the existing capability (prior to your commit) should meet all requirements. You may have to do some work to ensure that the child daemons properly flow through the provided code, but you most certainly don't need the change made to the base functions.

So why don't we revert just that piece out for now so it quits breaking existing functionality? You will find similar code already exists in the rsh launcher anyway - see lines 673 and following. All you have to do is enable those lines for daemons as well as the HNP so that the params get passed to your tree children.

We can then continue this argument at leisure while you take us through all the prior attempts and show how we were wrong.

I would just rather not derail everything I'm doing to go through this yet again - especially when it isn't necessary.

Thanks
Ralph


On Jun 24, 2009, at 4:05 PM, George Bosilca wrote:

Just for the sake of it. A funny command line to try:

[bosilca@dancer ~]$ mpirun --mca routed_base_verbose 0 --leave-session-attached -np 1 --mca orte_launch_agent "orted --mca routed_base_verbose 1" uptime

[node03:22355] [[14661,0],1] routed_linear: init routes for daemon job [14661,0]
hnp_uri 960823296.0;tcp://192.168.1.254:58135;tcp://192.168.0.2:58135
18:02:59 up 26 days, 17:41,  0 users,  load average: 0.97, 0.50, 0.53
[bosilca@dancer ~]$ [node03:22355] [[14661,0],1] routed_linear_get([[14661,0],0]) --> [[14661,0],0]
[node03:22355] [[14661,0],1] routed_linear: init routes for daemon job [14661,0]
hnp_uri 960823296.0;tcp://192.168.1.254:58135;tcp://192.168.0.2:58135
[node03:22355] [[14661,0],1] routed_linear_get([[14661,0],0]) --> [[14661,0],0]
[node03:22355] [[14661,0],1] routed_linear_get([[14661,0],0]) --> [[14661,0],0]
[node03:22355] [[14661,0],1] routed_linear_get([[14661,0],0]) --> [[14661,0],0]

This sets routed_base_verbose to zero for the HNP, and to 1 for everybody else. As you can see from the output, the orted printed routing information, which means it correctly interpreted the multi-word argument.

 george.

On Jun 24, 2009, at 17:52 , George Bosilca wrote:


On Jun 24, 2009, at 17:41 , Jeff Squyres wrote:

-----
[14:38] svbu-mpi:~/svn/ompi/orte % mpirun --mca plm_base_verbose 100 --leave-session-attached -np 1 --mca orte_launch_agent "$bogus/bin/orted -s" uptime
...lots of output...
srun --nodes=1 --ntasks=1 --kill-on-bad-exit --nodelist=svbu-mpi062 /home/jsquyres/bogus/bin/orted -s -mca ess slurm -mca orte_ess_jobid 3195142144 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri "3195142144.0;tcp://172.29.218.140:34489;tcp://10.10.20.250:34489;tcp://10.10.30.250:34489;tcp://192.168.183.1:34489;tcp://192.168.184.1:34489" -mca orte_nodelist svbu-mpi062 --mca plm_base_verbose 100 --mca orte_launch_agent "/home/jsquyres/bogus/bin/orted -s"
...
-----

and it hangs, because the argv[0]

"/home/jsquyres/bogus/bin/orted -s"

(including the quotes!) cannot be exec'ed.
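The failure mode described here can be reproduced outside of SLURM with a small POSIX sketch. exec takes each argv element as one opaque string, so a command passed as a single quoted element is looked up as a program with that literal name. In the sketch below run_argv is a hypothetical helper and /bin/sh is only a stand-in for orted; nothing here is taken from ORTE or srun.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, exec argv[0] with the given argv in the child, and return the
 * child's exit status. Returns 127 when the exec itself failed and -1
 * on fork/wait errors or abnormal termination. */
int run_argv(char *const argv[])
{
    pid_t pid;
    int status;

    assert(NULL != argv && NULL != argv[0]);
    pid = fork();
    if (pid < 0) return -1;
    if (0 == pid) {
        execv(argv[0], argv);
        _exit(127);   /* only reached when exec could not start argv[0] */
    }
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With char *fused[] = { "\"/bin/sh -c true\"", NULL } the call returns 127, because no file has that name, quotes and spaces included. With the words split into separate elements, { "/bin/sh", "-c", "exit 0", NULL }, it returns 0.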

OK, so maybe the -s option was a bad example (it's the one I use regularly). It blocks the orted: you have to log on to each node, attach gdb to the orted, and release it by doing a "set orted_spin_flag=0".

george.





On Jun 24, 2009, at 5:15 PM, George Bosilca wrote:

I can't guarantee this for all PLMs, but I can confirm that rsh and
slurm (1.3.12) work well with this.

We tried with and without Open MPI, and the outcome is the same.

[bosilca@dancer c]$ srun -n 4 echo "1 2 3 4 5 it works"
1 2 3 4 5 it works
1 2 3 4 5 it works
1 2 3 4 5 it works
1 2 3 4 5 it works

[bosilca@dancer c]$ srun -N 2 -c 2 mpirun --mca plm slurm --mca orte_launch_agent "orted -s" --mca plm_rsh_tree_spawn 1 --bynode --mca pml ob1 --mca orte_daemon_spin 0 ./hello
Hello, world, I am 0 of 2 on node03
Hello, world, I am 1 of 2 on node04

* after releasing the orteds from their spin.

In fact, what I find strange is the old behavior. Dropping arguments
without even letting the user know about it is certainly not a
desirable approach.
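For reference, the two behaviors under discussion can be put side by side in a simplified sketch. It walks the (flag, name, value) triples the way the loop over orted_cmd_line does, and either skips multi-word values (the old behavior) or wraps them in double quotes (what r21513 does). pass_mca_params and copy_str are hypothetical names, and the purge checks the real loop performs on single-word params are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd copy of s, wrapped in double quotes when quoted != 0.
 * Returns NULL on allocation failure. */
char *copy_str(const char *s, int quoted)
{
    size_t len = strlen(s) + (quoted ? 3 : 1);
    char *out = malloc(len);

    if (NULL != out) snprintf(out, len, quoted ? "\"%s\"" : "%s", s);
    return out;
}

/* Copy triples from in[0..cnt) to out[], which must have room for cnt
 * entries. Returns the number of strings written; the caller frees them.
 * quote_multiword == 0: multi-word values are dropped without notice.
 * quote_multiword != 0: multi-word values are passed along in quotes.
 * Allocation failures are not handled in this sketch. */
int pass_mca_params(char **in, int cnt, char **out, int quote_multiword)
{
    int i, n = 0;

    assert(NULL != in && NULL != out);
    for (i = 0; i + 2 < cnt; i += 3) {
        int multiword = (NULL != strchr(in[i + 2], ' '));

        if (multiword && !quote_multiword) {
            continue;   /* the whole triple is skipped */
        }
        out[n++] = copy_str(in[i], 0);
        out[n++] = copy_str(in[i + 1], 0);
        out[n++] = copy_str(in[i + 2], multiword);
    }
    return n;
}
```

Given the triples { "-mca", "plm_base_verbose", "100" } and { "-mca", "orte_launch_agent", "orted -s" }, the first mode writes only the first triple, and the second mode writes both, with the last value becoming "orted -s" including the quote characters. Whether the receiving environment strips those quotes is exactly the point in dispute.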

george.

On Jun 24, 2009, at 16:15 , Ralph Castain wrote:

> Yo George
>
> This commit is going to break non-rsh launchers. While it is true
> that the rsh launcher may handle multi-word options by putting them
> in quotes, we specifically avoided it here because it breaks SLURM,
> Torque, and others.
>
> This is why we specifically put the inclusion of multi-word options
> in the rsh plm module, and not here. Would you please move it back
> there?
>
> Thanks
> Ralph
>
>
> On Wed, Jun 24, 2009 at 1:51 PM, <bosilca@osl.iu.edu> wrote:
> Author: bosilca
> Date: 2009-06-24 15:51:52 EDT (Wed, 24 Jun 2009)
> New Revision: 21513
> URL: https://svn.open-mpi.org/trac/ompi/changeset/21513
>
> Log:
> When we get a report from an orted about its state, don't use the
> sender of
> the message to update the structures, but instead use the
> information from
> the URI. The reason is that even the launch report messages can get
> routed.
>
> Deal with the orted_cmd_line in a single location.
>
> Text files modified:
>   trunk/orte/mca/plm/base/plm_base_launch_support.c |    69 +++++++++
> ++++++++++++++----------------
>   1 files changed, 41 insertions(+), 28 deletions(-)
>
> Modified: trunk/orte/mca/plm/base/plm_base_launch_support.c
> ==============================================================================
> --- trunk/orte/mca/plm/base/plm_base_launch_support.c   (original)
> +++ trunk/orte/mca/plm/base/plm_base_launch_support.c   2009-06-24
> 15:51:52 EDT (Wed, 24 Jun 2009)
> @@ -433,7 +433,8 @@
>  {
>     orte_message_event_t *mev = (orte_message_event_t*)data;
>     opal_buffer_t *buffer = mev->buffer;
> -    char *rml_uri;
> +    orte_process_name_t peer;
> +    char *rml_uri = NULL;
>     int rc, idx;
>     int32_t arch;
>     orte_node_t **nodes;
> @@ -442,19 +443,11 @@
>     int64_t setupsec, setupusec;
>     int64_t startsec, startusec;
>
> -    OPAL_OUTPUT_VERBOSE((5, orte_plm_globals.output,
> -                         "%s plm:base:orted_report_launch from
> daemon %s",
> -                         ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
> -                         ORTE_NAME_PRINT(&mev->sender)));
> -
>     /* see if we need to timestamp this receipt */
>     if (orte_timing) {
>         gettimeofday(&recvtime, NULL);
>     }
>
> -    /* update state */
> -    pdatorted[mev->sender.vpid]->state = ORTE_PROC_STATE_RUNNING;
> -
>     /* unpack its contact info */
>     idx = 1;
>     if (ORTE_SUCCESS != (rc = opal_dss.unpack(buffer, &rml_uri,
> &idx, OPAL_STRING))) {
> @@ -466,13 +459,26 @@
>     /* set the contact info into the hash table */
>     if (ORTE_SUCCESS != (rc = orte_rml.set_contact_info(rml_uri))) {
>         ORTE_ERROR_LOG(rc);
> -        free(rml_uri);
>         orted_failed_launch = true;
>         goto CLEANUP;
>     }
> -    /* lookup and record this daemon's contact info */
> -    pdatorted[mev->sender.vpid]->rml_uri = strdup(rml_uri);
> -    free(rml_uri);
> +
> +    rc = orte_rml_base_parse_uris(rml_uri, &peer, NULL );
> +    if( ORTE_SUCCESS != rc ) {
> +        ORTE_ERROR_LOG(rc);
> +        orted_failed_launch = true;
> +        goto CLEANUP;
> +    }
> +
> +    OPAL_OUTPUT_VERBOSE((5, orte_plm_globals.output,
> +                         "%s plm:base:orted_report_launch from
> daemon %s via %s",
> +                         ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
> +                         ORTE_NAME_PRINT(&peer),
> +                         ORTE_NAME_PRINT(&mev->sender)));
> +
> +    /* update state and record for this daemon contact info */
> +    pdatorted[peer.vpid]->state = ORTE_PROC_STATE_RUNNING;
> +    pdatorted[peer.vpid]->rml_uri = rml_uri;
>
>     /* get the remote arch */
>     idx = 1;
> @@ -555,31 +561,33 @@
>
>     /* lookup the node */
>     nodes = (orte_node_t**)orte_node_pool->addr;
> -    if (NULL == nodes[mev->sender.vpid]) {
> +    if (NULL == nodes[peer.vpid]) {
>         ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
>         orted_failed_launch = true;
>         goto CLEANUP;
>     }
>     /* store the arch */
> -    nodes[mev->sender.vpid]->arch = arch;
> +    nodes[peer.vpid]->arch = arch;
>
>     /* if a tree-launch is underway, send the cmd back */
>     if (NULL != orte_tree_launch_cmd) {
> -        orte_rml.send_buffer(&mev->sender, orte_tree_launch_cmd,
> ORTE_RML_TAG_DAEMON, 0);
> +        orte_rml.send_buffer(&peer, orte_tree_launch_cmd,
> ORTE_RML_TAG_DAEMON, 0);
>     }
>
>  CLEANUP:
>
>     OPAL_OUTPUT_VERBOSE((5, orte_plm_globals.output,
> -                         "%s plm:base:orted_report_launch %s for
> daemon %s at contact %s",
> +                         "%s plm:base:orted_report_launch %s for
> daemon %s (via %s) at contact %s",
>                          ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
>                          orted_failed_launch ? "failed" : "completed",
> -                         ORTE_NAME_PRINT(&mev->sender),
> pdatorted[mev->sender.vpid]->rml_uri));
> +                         ORTE_NAME_PRINT(&peer),
> +                         ORTE_NAME_PRINT(&mev->sender),
> pdatorted[peer.vpid]->rml_uri));
>
>     /* release the message */
>     OBJ_RELEASE(mev);
>
>     if (orted_failed_launch) {
> +        if( NULL != rml_uri ) free(rml_uri);
>         orte_errmgr.incomplete_start(ORTE_PROC_MY_NAME->jobid,
> ORTE_ERROR_DEFAULT_EXIT_CODE);
>     } else {
>         orted_num_callback++;
> @@ -1133,18 +1141,23 @@
>      * being sure to "purge" any that would cause problems
>      * on backend nodes
>      */
> -    if (ORTE_PROC_IS_HNP) {
> +    if (ORTE_PROC_IS_HNP || ORTE_PROC_IS_DAEMON) {
>         cnt = opal_argv_count(orted_cmd_line);
>         for (i=0; i < cnt; i+=3) {
> -            /* if the specified option is more than one word, we
> don't
> -             * have a generic way of passing it as some
> environments ignore
> -             * any quotes we add, while others don't - so we ignore
> any
> -             * such options. In most cases, this won't be a problem
> as
> -             * they typically only apply to things of interest to
> the HNP.
> -             * Individual environments can add these back into the
> cmd line
> -             * as they know if it can be supported
> -             */
> -            if (NULL != strchr(orted_cmd_line[i+2], ' ')) {
> +             /* in the rsh environment, we can append multi-word
> arguments
> +              * by enclosing them in quotes. Check for any multi-word
> +              * mca params passed to mpirun and include them
> +              */
> +             if (NULL != strchr(orted_cmd_line[i+2], ' ')) {
> +                char* param;
> +
> +                /* must add quotes around it */
> +                asprintf(&param, "\"%s\"", orted_cmd_line[i+2]);
> +                /* now pass it along */
> +                opal_argv_append(argc, argv, orted_cmd_line[i]);
> +                opal_argv_append(argc, argv, orted_cmd_line[i+1]);
> +                opal_argv_append(argc, argv, param);
> +                free(param);
>                 continue;
>             }
>             /* The daemon will attempt to open the PLM on the remote
> _______________________________________________
> svn mailing list
> svn@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/svn
>
> _______________________________________________
> devel mailing list
> devel@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




--
Jeff Squyres
Cisco Systems
