
Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OpenMPI and OAR issues
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-11-06 15:36:17


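OMPI assumes (for faster startup) that your local shell is the same as
your remote shell. Your debug output shows a tcsh local shell, while the
errors come from bash, so the csh-style launch line is being handed to
the wrong shell. If the shells differ, try setting the MCA parameter
pls_rsh_assume_same_shell to 0.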
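
To make the suggestion concrete, here is a minimal sketch, not a tested fix. It first checks that bash rejects the csh-style prefix seen in the failing launch line, then prints the original mpirun command with the extra parameter added in the same "-mca name value" form the report already uses. The variable names (csh_fragment, parse_result, fixed_cmd) are illustrative only, and the command is printed rather than run so the sketch works outside an OAR allocation.

```shell
#!/bin/sh
# csh-style prefix copied from the failing launch line; single quotes keep
# $path literal so bash receives exactly what the remote side received.
csh_fragment='set path = ( /n/poolfs/z/home/apellegr/openmpi/bin $path )'

# bash cannot parse csh syntax; this is the "unexpected token `('" error.
if bash -c "$csh_fragment" 2>/dev/null; then
    parse_result="accepted"
else
    parse_result="rejected"
fi
echo "bash parse of csh fragment: $parse_result"

# Same command as in the report, plus the parameter suggested above.
# Kept as a string (with $OAR_FILE_NODES unexpanded) and only printed.
fixed_cmd='mpirun -prefix /n/poolfs/z/home/apellegr/openmpi -machinefile $OAR_FILE_NODES -mca pls_rsh_agent "oarsh" -mca pls_rsh_assume_same_shell 0 -np 10 /n/poolfs/z/home/apellegr/mpi_test/hello_world.x86'
echo "$fixed_cmd"
```

Passing the parameter on the command line works the same way from tcsh or bash, which matters here because the local shell is tcsh.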

On Nov 6, 2008, at 3:31 PM, George Bosilca wrote:

> OAR is the batch scheduler used on the Grid5K platform. As far as I
> know, set is a basic shell internal command, and it is understood by
> all shells. The problem here seems to be that somehow we're running
> bash, but feeding it tcsh shell code (because setenv is definitely not
> something that bash understands).
>
> george.
>
> On Nov 6, 2008, at 3:07 PM, Ralph Castain wrote:
>
>> I have no idea what "oar" is, but it looks to me like the rsh
>> launcher is getting confused about the remote shell it will use - I
>> don't believe that the "set" cmd shown below is proper bash syntax,
>> and that is the error that is causing the launch to fail.
>>
>> What remote shell should it find? I know we don't have any "oar"
>> shell-specific code in the system, but maybe it looks like
>> something else?
>>
>> On Nov 6, 2008, at 12:55 PM, Andrea Pellegrini wrote:
>>
>>> Hi all,
>>> I'm trying to run an Open MPI application on an OAR cluster. I think
>>> the cluster is configured correctly but I still have problems when
>>> I run mpirun:
>>>
>>> apellegr_at_m45-037:~$ mpirun -prefix /n/poolfs/z/home/apellegr/openmpi
>>> -machinefile $OAR_FILE_NODES -mca pls_rsh_agent "oarsh" -np 10
>>> /n/poolfs/z/home/apellegr/mpi_test/hello_world.x86
>>> bash: -c: line 0: syntax error near unexpected token `('
>>> bash: -c: line 0: ` set path = ( /n/poolfs/z/home/apellegr/openmpi/
>>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>>> bootproxy 1 --name 0.0.4 --num_procs 5 --vpid_start 0 --nodename
>>> m45-040.pool --universe apellegr_at_m45-037.pool:default-
>>> universe-29482 --nsreplica "0.0.0;tcp://10.11.45.37:36790" --
>>> gprreplica "0.0.0;tcp://10.11.45.37:36790"'
>>> bash: -c: line 0: syntax error near unexpected token `('
>>> bash: -c: line 0: ` set path = ( /n/poolfs/z/home/apellegr/openmpi/
>>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>>> bootproxy 1 --name 0.0.2 --num_procs 5 --vpid_start 0 --nodename
>>> m45-038.pool --universe apellegr_at_m45-037.pool:default-
>>> universe-29482 --nsreplica "0.0.0;tcp://10.11.45.37:36790" --
>>> gprreplica "0.0.0;tcp://10.11.45.37:36790"'
>>> [m45-037.pool:29482] ERROR: A daemon on node m45-038.pool failed
>>> to start as expected.
>>> [m45-037.pool:29482] ERROR: There may be more information
>>> available from
>>> [m45-037.pool:29482] ERROR: the remote shell (see above).
>>> [m45-037.pool:29482] ERROR: The daemon exited unexpectedly with
>>> status 2.
>>> bash: -c: line 0: syntax error near unexpected token `('
>>> bash: -c: line 0: ` set path = ( /n/poolfs/z/home/apellegr/openmpi/
>>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>>> bootproxy 1 --name 0.0.3 --num_procs 5 --vpid_start 0 --nodename
>>> m45-039.pool --universe apellegr_at_m45-037.pool:default-
>>> universe-29482 --nsreplica "0.0.0;tcp://10.11.45.37:36790" --
>>> gprreplica "0.0.0;tcp://10.11.45.37:36790"'
>>> [m45-037.pool:29482] ERROR: A daemon on node m45-039.pool failed
>>> to start as expected.
>>> [m45-037.pool:29482] ERROR: There may be more information
>>> available from
>>> [m45-037.pool:29482] ERROR: the remote shell (see above).
>>> [m45-037.pool:29482] ERROR: The daemon exited unexpectedly with
>>> status 2.
>>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in
>>> file ../../../../orte/mca/pls/base/pls_base_orted_cmds.c at line 275
>>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in
>>> file ../../../../../orte/mca/pls/rsh/pls_rsh_module.c at line 1158
>>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in
>>> file ../../../../../orte/mca/errmgr/hnp/errmgr_hnp.c at line 90
>>> [m45-037.pool:29482] ERROR: A daemon on node m45-040.pool failed
>>> to start as expected.
>>> [m45-037.pool:29482] ERROR: There may be more information
>>> available from
>>> [m45-037.pool:29482] ERROR: the remote shell (see above).
>>> [m45-037.pool:29482] ERROR: The daemon exited unexpectedly with
>>> status 2.
>>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in
>>> file ../../../../orte/mca/pls/base/pls_base_orted_cmds.c at line 188
>>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in
>>> file ../../../../../orte/mca/pls/rsh/pls_rsh_module.c at line 1190
>>> --------------------------------------------------------------------------
>>> mpirun was unable to cleanly terminate the daemons for this job.
>>> Returned value Timeout instead of ORTE_SUCCESS.
>>> --------------------------------------------------------------------------
>>> apellegr_at_m45-037:~$
>>>
>>>
>>> If I run it with the option "-mca pls_rsh_debug 1" I get:
>>>
>>> apellegr_at_m45-037:~$ mpirun -prefix /n/poolfs/z/home/apellegr/openmpi
>>> -machinefile $OAR_FILE_NODES -mca pls_rsh_debug 1 -mca pls_rsh_agent "oarsh"
>>> -np 10 /n/poolfs/z/home/apellegr/mpi_test/hello_world.x86
>>> [m45-037.pool:29473] pls:rsh: local shell: 2 (tcsh)
>>> [m45-037.pool:29473] pls:rsh: assuming same remote shell as local
>>> shell
>>> [m45-037.pool:29473] pls:rsh: remote shell: 2 (tcsh)
>>> [m45-037.pool:29473] pls:rsh: final template argv:
>>> [m45-037.pool:29473] pls:rsh: /usr/bin/oarsh <template> orted
>>> --bootproxy 1 --name <template> --num_procs 5 --vpid_start 0 --
>>> nodename <template> --universe apellegr_at_m45-037.pool:default-
>>> universe-29473 --nsreplica "0.0.0;tcp://10.11.45.37:55477" --
>>> gprreplica "0.0.0;tcp://10.11.45.37:55477"
>>> [m45-037.pool:29473] pls:rsh: launching on node m45-037.pool
>>> [m45-037.pool:29473] pls:rsh: m45-037.pool is a LOCAL node
>>> [m45-037.pool:29473] pls:rsh: reset PATH: /n/poolfs/z/home/
>>> apellegr/openmpi/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/
>>> bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/
>>> bin:/n/poolfs/z/home/apellegr/openssl/bin
>>> [m45-037.pool:29473] pls:rsh: reset LD_LIBRARY_PATH: /n/poolfs/z/
>>> home/apellegr/openmpi/lib
>>> [m45-037.pool:29473] pls:rsh: changing to directory /home/apellegr
>>> [m45-037.pool:29473] pls:rsh: executing: (/n/poolfs/z/home/
>>> apellegr/openmpi/bin/orted) orted --bootproxy 1 --name 0.0.1 --
>>> num_procs 5 --vpid_start 0 --nodename m45-037.pool --universe apellegr_at_m45-037.pool
>>> :default-universe-29473 --nsreplica "0.0.0;tcp://
>>> 10.11.45.37:55477" --gprreplica "0.0.0;tcp://10.11.45.37:55477" --
>>> set-sid [OAR_JOBID=597856 HOST=m45-037.pool TERM=xterm SHELL=/bin/
>>> tcsh OAR_WORKING_DIRECTORY=/home/apellegr SSH_CLIENT=10.11.0.4
>>> 50481 6667 OAR_USER=apellegr GROUP=csestudents USER=apellegr
>>> SUDO_USER=oar OAR_WORKDIR=/home/apellegr SUDO_UID=30143
>>> HOSTTYPE=i486-linux USERNAME=apellegr OAR_JOB_NAME= OAR_NODE_FILE=/
>>> var/lib/oar/597856 OAR_RESOURCE_PROPERTIES_FILE=/var/lib/oar/
>>> 597856_resources MAIL=/var/mail/oar PATH=/n/poolfs/z/home/apellegr/
>>> openmpi/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/
>>> sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/bin:/n/
>>> poolfs/z/home/apellegr/openssl/bin OAR_PROJECT_NAME=default
>>> OAR_JOB_WALLTIME_SECONDS=7200 PWD=/home/apellegr HOME=/home/
>>> apellegr SUDO_COMMAND=OAR SHLVL=2 OAR_FILE_NODES=/var/lib/oar/
>>> 597856 OSTYPE=linux VENDOR=intel OAR_JOB_WALLTIME=2:0:0
>>> MACHTYPE=i486 LOGNAME=apellegr OAR_NODEFILE=/var/lib/oar/597856
>>> OAR_RESOURCE_FILE=/var/lib/oar/597856 SUDO_GID=390
>>> OAR_JOB_ID=597856 OAR_O_WORKDIR=/home/apellegr _=/n/poolfs/z/home/
>>> apellegr/openmpi/bin/mpirun OLDPWD=/home/apellegr/openmpi
>>> OMPI_MCA_rds_hostfile_path=/var/lib/oar/597856
>>> OMPI_MCA_pls_rsh_debug=1 OMPI_MCA_pls_rsh_agent=oarsh
>>> LD_LIBRARY_PATH=/n/poolfs/z/home/apellegr/openmpi/lib
>>> OMPI_MCA_seed=0]
>>> [m45-037.pool:29473] pls:rsh: launching on node m45-038.pool
>>> [m45-037.pool:29473] pls:rsh: m45-038.pool is a REMOTE node
>>> [m45-037.pool:29473] pls:rsh: executing: (//usr/bin/oarsh) /usr/
>>> bin/oarsh m45-038.pool set path = ( /n/poolfs/z/home/apellegr/
>>> openmpi/bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set
>>> OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv
>>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib ; if ( $?
>>> OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>>> apellegr/openmpi/lib:$LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/
>>> openmpi/bin/orted --bootproxy 1 --name 0.0.2 --num_procs 5 --
>>> vpid_start 0 --nodename m45-038.pool --universe apellegr_at_m45-037.pool
>>> :default-universe-29473 --nsreplica "0.0.0;tcp://
>>> 10.11.45.37:55477" --gprreplica "0.0.0;tcp://
>>> 10.11.45.37:55477" [OAR_JOBID=597856 HOST=m45-037.pool TERM=xterm
>>> SHELL=/bin/tcsh OAR_WORKING_DIRECTORY=/home/apellegr
>>> SSH_CLIENT=10.11.0.4 50481 6667 OAR_USER=apellegr
>>> GROUP=csestudents USER=apellegr SUDO_USER=oar OAR_WORKDIR=/home/
>>> apellegr SUDO_UID=30143 HOSTTYPE=i486-linux USERNAME=apellegr
>>> OAR_JOB_NAME= OAR_NODE_FILE=/var/lib/oar/597856
>>> OAR_RESOURCE_PROPERTIES_FILE=/var/lib/oar/597856_resources MAIL=/
>>> var/mail/oar PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/
>>> bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/
>>> bin:/n/poolfs/z/home/apellegr/openssl/bin OAR_PROJECT_NAME=default
>>> OAR_JOB_WALLTIME_SECONDS=7200 PWD=/home/apellegr HOME=/home/
>>> apellegr SUDO_COMMAND=OAR SHLVL=2 OAR_FILE_NODES=/var/lib/oar/
>>> 597856 OSTYPE=linux VENDOR=intel OAR_JOB_WALLTIME=2:0:0
>>> MACHTYPE=i486 LOGNAME=apellegr OAR_NODEFILE=/var/lib/oar/597856
>>> OAR_RESOURCE_FILE=/var/lib/oar/597856 SUDO_GID=390
>>> OAR_JOB_ID=597856 OAR_O_WORKDIR=/home/apellegr _=/n/poolfs/z/home/
>>> apellegr/openmpi/bin/mpirun OLDPWD=/home/apellegr/openmpi
>>> OMPI_MCA_rds_hostfile_path=/var/lib/oar/597856
>>> OMPI_MCA_pls_rsh_debug=1 OMPI_MCA_pls_rsh_agent=oarsh
>>> OMPI_MCA_seed=0]
>>> bash: -c: line 0: syntax error near unexpected token `('
>>> bash: -c: line 0: ` set path = ( /n/poolfs/z/home/apellegr/openmpi/
>>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>>> bootproxy 1 --name 0.0.2 --num_procs 5 --vpid_start 0 --nodename
>>> m45-038.pool --universe apellegr_at_m45-037.pool:default-
>>> universe-29473 --nsreplica "0.0.0;tcp://10.11.45.37:55477" --
>>> gprreplica "0.0.0;tcp://10.11.45.37:55477"'
>>> [m45-037.pool:29473] pls:rsh: launching on node m45-039.pool
>>> [m45-037.pool:29473] ERROR: A daemon on node m45-038.pool failed
>>> to start as expected.
>>> [m45-037.pool:29473] ERROR: There may be more information
>>> available from
>>> [m45-037.pool:29473] ERROR: the remote shell (see above).
>>> [m45-037.pool:29473] ERROR: The daemon exited unexpectedly with
>>> status 2.
>>> [m45-037.pool:29473] pls:rsh: m45-039.pool is a REMOTE node
>>> [m45-037.pool:29473] pls:rsh: executing: (//usr/bin/oarsh) /usr/
>>> bin/oarsh m45-039.pool set path = ( /n/poolfs/z/home/apellegr/
>>> openmpi/bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set
>>> OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv
>>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib ; if ( $?
>>> OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>>> apellegr/openmpi/lib:$LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/
>>> openmpi/bin/orted --bootproxy 1 --name 0.0.3 --num_procs 5 --
>>> vpid_start 0 --nodename m45-039.pool --universe apellegr_at_m45-037.pool
>>> :default-universe-29473 --nsreplica "0.0.0;tcp://
>>> 10.11.45.37:55477" --gprreplica "0.0.0;tcp://
>>> 10.11.45.37:55477" [OAR_JOBID=597856 HOST=m45-037.pool TERM=xterm
>>> SHELL=/bin/tcsh OAR_WORKING_DIRECTORY=/home/apellegr
>>> SSH_CLIENT=10.11.0.4 50481 6667 OAR_USER=apellegr
>>> GROUP=csestudents USER=apellegr SUDO_USER=oar OAR_WORKDIR=/home/
>>> apellegr SUDO_UID=30143 HOSTTYPE=i486-linux USERNAME=apellegr
>>> OAR_JOB_NAME= OAR_NODE_FILE=/var/lib/oar/597856
>>> OAR_RESOURCE_PROPERTIES_FILE=/var/lib/oar/597856_resources MAIL=/
>>> var/mail/oar PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/
>>> bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/
>>> bin:/n/poolfs/z/home/apellegr/openssl/bin OAR_PROJECT_NAME=default
>>> OAR_JOB_WALLTIME_SECONDS=7200 PWD=/home/apellegr HOME=/home/
>>> apellegr SUDO_COMMAND=OAR SHLVL=2 OAR_FILE_NODES=/var/lib/oar/
>>> 597856 OSTYPE=linux VENDOR=intel OAR_JOB_WALLTIME=2:0:0
>>> MACHTYPE=i486 LOGNAME=apellegr OAR_NODEFILE=/var/lib/oar/597856
>>> OAR_RESOURCE_FILE=/var/lib/oar/597856 SUDO_GID=390
>>> OAR_JOB_ID=597856 OAR_O_WORKDIR=/home/apellegr _=/n/poolfs/z/home/
>>> apellegr/openmpi/bin/mpirun OLDPWD=/home/apellegr/openmpi
>>> OMPI_MCA_rds_hostfile_path=/var/lib/oar/597856
>>> OMPI_MCA_pls_rsh_debug=1 OMPI_MCA_pls_rsh_agent=oarsh
>>> OMPI_MCA_seed=0]
>>> bash: -c: line 0: syntax error near unexpected token `('
>>> bash: -c: line 0: ` set path = ( /n/poolfs/z/home/apellegr/openmpi/
>>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>>> bootproxy 1 --name 0.0.3 --num_procs 5 --vpid_start 0 --nodename
>>> m45-039.pool --universe apellegr_at_m45-037.pool:default-
>>> universe-29473 --nsreplica "0.0.0;tcp://10.11.45.37:55477" --
>>> gprreplica "0.0.0;tcp://10.11.45.37:55477"'
>>> [m45-037.pool:29473] pls:rsh: launching on node m45-040.pool
>>> [m45-037.pool:29473] ERROR: A daemon on node m45-039.pool failed
>>> to start as expected.
>>> [m45-037.pool:29473] ERROR: There may be more information
>>> available from
>>> [m45-037.pool:29473] ERROR: the remote shell (see above).
>>> [m45-037.pool:29473] ERROR: The daemon exited unexpectedly with
>>> status 2.
>>> [m45-037.pool:29473] pls:rsh: m45-040.pool is a REMOTE node
>>> [m45-037.pool:29473] pls:rsh: executing: (//usr/bin/oarsh) /usr/
>>> bin/oarsh m45-040.pool set path = ( /n/poolfs/z/home/apellegr/
>>> openmpi/bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set
>>> OMPI_have_llp ; if ( $?LD_LIBRARY_PATH == 0 ) setenv
>>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib ; if ( $?
>>> OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>>> apellegr/openmpi/lib:$LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/
>>> openmpi/bin/orted --bootproxy 1 --name 0.0.4 --num_procs 5 --
>>> vpid_start 0 --nodename m45-040.pool --universe apellegr_at_m45-037.pool
>>> :default-universe-29473 --nsreplica "0.0.0;tcp://
>>> 10.11.45.37:55477" --gprreplica "0.0.0;tcp://
>>> 10.11.45.37:55477" [OAR_JOBID=597856 HOST=m45-037.pool TERM=xterm
>>> SHELL=/bin/tcsh OAR_WORKING_DIRECTORY=/home/apellegr
>>> SSH_CLIENT=10.11.0.4 50481 6667 OAR_USER=apellegr
>>> GROUP=csestudents USER=apellegr SUDO_USER=oar OAR_WORKDIR=/home/
>>> apellegr SUDO_UID=30143 HOSTTYPE=i486-linux USERNAME=apellegr
>>> OAR_JOB_NAME= OAR_NODE_FILE=/var/lib/oar/597856
>>> OAR_RESOURCE_PROPERTIES_FILE=/var/lib/oar/597856_resources MAIL=/
>>> var/mail/oar PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/
>>> bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/
>>> bin:/n/poolfs/z/home/apellegr/openssl/bin OAR_PROJECT_NAME=default
>>> OAR_JOB_WALLTIME_SECONDS=7200 PWD=/home/apellegr HOME=/home/
>>> apellegr SUDO_COMMAND=OAR SHLVL=2 OAR_FILE_NODES=/var/lib/oar/
>>> 597856 OSTYPE=linux VENDOR=intel OAR_JOB_WALLTIME=2:0:0
>>> MACHTYPE=i486 LOGNAME=apellegr OAR_NODEFILE=/var/lib/oar/597856
>>> OAR_RESOURCE_FILE=/var/lib/oar/597856 SUDO_GID=390
>>> OAR_JOB_ID=597856 OAR_O_WORKDIR=/home/apellegr _=/n/poolfs/z/home/
>>> apellegr/openmpi/bin/mpirun OLDPWD=/home/apellegr/openmpi
>>> OMPI_MCA_rds_hostfile_path=/var/lib/oar/597856
>>> OMPI_MCA_pls_rsh_debug=1 OMPI_MCA_pls_rsh_agent=oarsh
>>> OMPI_MCA_seed=0]
>>> bash: -c: line 0: syntax error near unexpected token `('
>>> bash: -c: line 0: ` set path = ( /n/poolfs/z/home/apellegr/openmpi/
>>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>>> bootproxy 1 --name 0.0.4 --num_procs 5 --vpid_start 0 --nodename
>>> m45-040.pool --universe apellegr_at_m45-037.pool:default-
>>> universe-29473 --nsreplica "0.0.0;tcp://10.11.45.37:55477" --
>>> gprreplica "0.0.0;tcp://10.11.45.37:55477"'
>>> [m45-037.pool:29473] ERROR: A daemon on node m45-040.pool failed
>>> to start as expected.
>>> [m45-037.pool:29473] ERROR: There may be more information
>>> available from
>>> [m45-037.pool:29473] ERROR: the remote shell (see above).
>>> [m45-037.pool:29473] ERROR: The daemon exited unexpectedly with
>>> status 2.
>>> [m45-037.pool:29473] [0,0,0] ORTE_ERROR_LOG: Timeout in
>>> file ../../../../orte/mca/pls/base/pls_base_orted_cmds.c at line 188
>>> [m45-037.pool:29473] [0,0,0] ORTE_ERROR_LOG: Timeout in
>>> file ../../../../../orte/mca/pls/rsh/pls_rsh_module.c at line 1190
>>> --------------------------------------------------------------------------
>>> mpirun was unable to cleanly terminate the daemons for this job.
>>> Returned value Timeout instead of ORTE_SUCCESS.
>>> --------------------------------------------------------------------------
>>> apellegr_at_m45-037:~$
>>>
>>> Can anybody help me?
>>> Thanks,
>>> ~Andrea
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
Cisco Systems