
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] OpenMPI and OAR issues
From: Andrea Pellegrini (andrea.pellegrini_at_[hidden])
Date: 2008-11-06 15:47:45


Thanks guys!
I finally fixed my problem!!!

apellegr_at_m45-039:~$ mpirun -prefix ~/openmpi -machinefile
$OAR_FILE_NODES -mca pls_rsh_assume_same_shell 0 -mca pls_rsh_agent
"oarsh" -np 2 /n/poolfs/z/home/apellegr/mpi_test/hello_world.x86
Warning: Permanently added '[m45-039.pool]:6667' (RSA) to the list of
known hosts.
m45-039.pool: hello world from rank 0
m45-040.pool: hello world from rank 1
apellegr_at_m45-039:~$

It was a problem with the consoles.
Thanks again!
~Andrea
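The root cause worked out in the replies below is tcsh-style syntax being handed to a remote bash. A minimal sketch, reproducing the failure and showing the bash equivalent of the environment setup the launcher intended (the /opt/openmpi prefix here is illustrative, not taken from the thread):

```shell
# Feed bash the tcsh-style command the launcher built: bash rejects the
# "set path = ( ... )" form, which is exactly the error seen in the logs.
bash -c 'set path = ( /opt/openmpi/bin $path )' 2>/dev/null \
    || echo "tcsh-style command rejected by bash, as expected"

# The same environment setup written in bash syntax:
OMPI_PREFIX=/opt/openmpi        # hypothetical install prefix
PATH="$OMPI_PREFIX/bin:$PATH"; export PATH
if [ -z "${LD_LIBRARY_PATH:-}" ]; then
    LD_LIBRARY_PATH="$OMPI_PREFIX/lib"
else
    LD_LIBRARY_PATH="$OMPI_PREFIX/lib:$LD_LIBRARY_PATH"
fi
export LD_LIBRARY_PATH
```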

Ralph Castain wrote:
> Thanks for the OAR explanation!
>
> Sorry - I should have been clearer in my comment. I was trying to
> indicate that the cmd starting with "set" is triggering a bash syntax
> error, and that is why the launch fails.
>
> The rsh launcher uses a little "probe" technique to try to guess the
> remote shell. Apparently, it thinks this is tcsh, while the remote
> node thinks it will use bash.
>
> Are you running this from bash? If so, you could perhaps resolve the
> problem by specifying -mca pls_rsh_assume_same_shell 1 on your command
> line. This will override the probe and force the system to use the
> syntax appropriate to the same shell you used for mpirun.
>
> Alternatively, you could set -mca pls_rsh_debug 1 to see all the debug
> output as the system probes your remote shell. That might help you
> figure out why it thinks it is tcsh.
>
> Ralph
>
>
> On Nov 6, 2008, at 1:31 PM, George Bosilca wrote:
>
>> OAR is the batch scheduler used on the Grid5K platform. As far as I
>> know, set is a basic shell built-in command, and it is understood by
>> all shells. The problem here seems to be that somehow we're using
>> bash, but with tcsh shell code (because setenv is definitely not
>> something that bash understands).
>>
>> george.
>>
>> On Nov 6, 2008, at 3:07 PM, Ralph Castain wrote:
>>
>>> I have no idea what "oar" is, but it looks to me like the rsh
>>> launcher is getting confused about the remote shell it will use - I
>>> don't believe that the "set" cmd shown below is proper bash syntax,
>>> and that is the error that is causing the launch to fail.
>>>
>>> What remote shell should it find? I know we don't have any "oar"
>>> shell-specific code in the system, but maybe it looks like something
>>> else?
>>>
>>> On Nov 6, 2008, at 12:55 PM, Andrea Pellegrini wrote:
>>>
>>>> Hi all,
>>>> I'm trying to run an Open MPI application on an OAR cluster. I think
>>>> the cluster is configured correctly, but I still have problems when
>>>> I run mpirun:
>>>>
>>>> apellegr_at_m45-037:~$ mpirun -prefix
>>>> /n/poolfs/z/home/apellegr/openmpi -machinefile $OAR_FILE_NODES -mca
>>>> pls_rsh_agent "oarsh" -np 10
>>>> /n/poolfs/z/home/apellegr/mpi_test/hello_world.x86 bash: -c: line
>>>> 0: syntax error near unexpected token `('
>>>> bash: -c: line 0: ` set path = (
>>>> /n/poolfs/z/home/apellegr/openmpi/bin $path ) ; if (
>>>> $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH
>>>> == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib
>>>> ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH
>>>> /n/poolfs/z/home/apellegr/openmpi/lib:$LD_LIBRARY_PATH ;
>>>> /n/poolfs/z/home/apellegr/openmpi/bin/orted --bootproxy 1 --name
>>>> 0.0.4 --num_procs 5 --vpid_start 0 --nodename m45-040.pool
>>>> --universe apellegr_at_m45-037.pool:default-universe-29482 --nsreplica
>>>> "0.0.0;tcp://10.11.45.37:36790" --gprreplica
>>>> "0.0.0;tcp://10.11.45.37:36790"'
>>>> bash: -c: line 0: syntax error near unexpected token `('
>>>> bash: -c: line 0: ` set path = (
>>>> /n/poolfs/z/home/apellegr/openmpi/bin $path ) ; if (
>>>> $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH
>>>> == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib
>>>> ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH
>>>> /n/poolfs/z/home/apellegr/openmpi/lib:$LD_LIBRARY_PATH ;
>>>> /n/poolfs/z/home/apellegr/openmpi/bin/orted --bootproxy 1 --name
>>>> 0.0.2 --num_procs 5 --vpid_start 0 --nodename m45-038.pool
>>>> --universe apellegr_at_m45-037.pool:default-universe-29482 --nsreplica
>>>> "0.0.0;tcp://10.11.45.37:36790" --gprreplica
>>>> "0.0.0;tcp://10.11.45.37:36790"'
>>>> [m45-037.pool:29482] ERROR: A daemon on node m45-038.pool failed to
>>>> start as expected.
>>>> [m45-037.pool:29482] ERROR: There may be more information available
>>>> from
>>>> [m45-037.pool:29482] ERROR: the remote shell (see above).
>>>> [m45-037.pool:29482] ERROR: The daemon exited unexpectedly with
>>>> status 2.
>>>> bash: -c: line 0: syntax error near unexpected token `('
>>>> bash: -c: line 0: ` set path = (
>>>> /n/poolfs/z/home/apellegr/openmpi/bin $path ) ; if (
>>>> $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH
>>>> == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib
>>>> ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH
>>>> /n/poolfs/z/home/apellegr/openmpi/lib:$LD_LIBRARY_PATH ;
>>>> /n/poolfs/z/home/apellegr/openmpi/bin/orted --bootproxy 1 --name
>>>> 0.0.3 --num_procs 5 --vpid_start 0 --nodename m45-039.pool
>>>> --universe apellegr_at_m45-037.pool:default-universe-29482 --nsreplica
>>>> "0.0.0;tcp://10.11.45.37:36790" --gprreplica
>>>> "0.0.0;tcp://10.11.45.37:36790"'
>>>> [m45-037.pool:29482] ERROR: A daemon on node m45-039.pool failed to
>>>> start as expected.
>>>> [m45-037.pool:29482] ERROR: There may be more information available
>>>> from
>>>> [m45-037.pool:29482] ERROR: the remote shell (see above).
>>>> [m45-037.pool:29482] ERROR: The daemon exited unexpectedly with
>>>> status 2.
>>>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>>> ../../../../orte/mca/pls/base/pls_base_orted_cmds.c at line 275
>>>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>>> ../../../../../orte/mca/pls/rsh/pls_rsh_module.c at line 1158
>>>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>>> ../../../../../orte/mca/errmgr/hnp/errmgr_hnp.c at line 90
>>>> [m45-037.pool:29482] ERROR: A daemon on node m45-040.pool failed to
>>>> start as expected.
>>>> [m45-037.pool:29482] ERROR: There may be more information available
>>>> from
>>>> [m45-037.pool:29482] ERROR: the remote shell (see above).
>>>> [m45-037.pool:29482] ERROR: The daemon exited unexpectedly with
>>>> status 2.
>>>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>>> ../../../../orte/mca/pls/base/pls_base_orted_cmds.c at line 188
>>>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>>> ../../../../../orte/mca/pls/rsh/pls_rsh_module.c at line 1190
>>>> --------------------------------------------------------------------------
>>>>
>>>> mpirun was unable to cleanly terminate the daemons for this job.
>>>> Returned value Timeout instead of ORTE_SUCCESS.
>>>> --------------------------------------------------------------------------
>>>>
>>>> apellegr_at_m45-037:~$
>>>>
>>>>
>>>> If I run it with the option "-mca pls_rsh_debug 1" I get:
>>>>
>>>> apellegr_at_m45-037:~$ mpirun -prefix
>>>> /n/poolfs/z/home/apellegr/openmpi -machinefile $OAR_FILE_NODES -mca
>>>> pls_rsh_debug 1 -mca pls_rsh_agent "oarsh" -np 10
>>>> /n/poolfs/z/home/apellegr/mpi_test/hello_world.x86
>>>> [m45-037.pool:29473] pls:rsh: local shell: 2 (tcsh)
>>>> [m45-037.pool:29473] pls:rsh: assuming same remote shell as local
>>>> shell
>>>> [m45-037.pool:29473] pls:rsh: remote shell: 2 (tcsh)
>>>> [m45-037.pool:29473] pls:rsh: final template argv:
>>>> [m45-037.pool:29473] pls:rsh: /usr/bin/oarsh <template> orted
>>>> --bootproxy 1 --name <template> --num_procs 5 --vpid_start 0
>>>> --nodename <template> --universe
>>>> apellegr_at_m45-037.pool:default-universe-29473 --nsreplica
>>>> "0.0.0;tcp://10.11.45.37:55477" --gprreplica
>>>> "0.0.0;tcp://10.11.45.37:55477"
>>>> [m45-037.pool:29473] pls:rsh: launching on node m45-037.pool
>>>> [m45-037.pool:29473] pls:rsh: m45-037.pool is a LOCAL node
>>>> [m45-037.pool:29473] pls:rsh: reset PATH:
>>>> /n/poolfs/z/home/apellegr/openmpi/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/bin:/n/poolfs/z/home/apellegr/openssl/bin
>>>>
>>>> [m45-037.pool:29473] pls:rsh: reset LD_LIBRARY_PATH:
>>>> /n/poolfs/z/home/apellegr/openmpi/lib
>>>> [m45-037.pool:29473] pls:rsh: changing to directory /home/apellegr
>>>> [m45-037.pool:29473] pls:rsh: executing:
>>>> (/n/poolfs/z/home/apellegr/openmpi/bin/orted) orted --bootproxy 1
>>>> --name 0.0.1 --num_procs 5 --vpid_start 0 --nodename m45-037.pool
>>>> --universe apellegr_at_m45-037.pool:default-universe-29473 --nsreplica
>>>> "0.0.0;tcp://10.11.45.37:55477" --gprreplica
>>>> "0.0.0;tcp://10.11.45.37:55477" --set-sid [OAR_JOBID=597856
>>>> HOST=m45-037.pool TERM=xterm SHELL=/bin/tcsh
>>>> OAR_WORKING_DIRECTORY=/home/apellegr SSH_CLIENT=10.11.0.4 50481
>>>> 6667 OAR_USER=apellegr GROUP=csestudents USER=apellegr
>>>> SUDO_USER=oar OAR_WORKDIR=/home/apellegr SUDO_UID=30143
>>>> HOSTTYPE=i486-linux USERNAME=apellegr OAR_JOB_NAME=
>>>> OAR_NODE_FILE=/var/lib/oar/597856
>>>> OAR_RESOURCE_PROPERTIES_FILE=/var/lib/oar/597856_resources
>>>> MAIL=/var/mail/oar
>>>> PATH=/n/poolfs/z/home/apellegr/openmpi/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/bin:/n/poolfs/z/home/apellegr/openssl/bin
>>>> OAR_PROJECT_NAME=default OAR_JOB_WALLTIME_SECONDS=7200
>>>> PWD=/home/apellegr HOME=/home/apellegr SUDO_COMMAND=OAR SHLVL=2
>>>> OAR_FILE_NODES=/var/lib/oar/597856 OSTYPE=linux VENDOR=intel
>>>> OAR_JOB_WALLTIME=2:0:0 MACHTYPE=i486 LOGNAME=apellegr
>>>> OAR_NODEFILE=/var/lib/oar/597856
>>>> OAR_RESOURCE_FILE=/var/lib/oar/597856 SUDO_GID=390
>>>> OAR_JOB_ID=597856 OAR_O_WORKDIR=/home/apellegr
>>>> _=/n/poolfs/z/home/apellegr/openmpi/bin/mpirun
>>>> OLDPWD=/home/apellegr/openmpi
>>>> OMPI_MCA_rds_hostfile_path=/var/lib/oar/597856
>>>> OMPI_MCA_pls_rsh_debug=1 OMPI_MCA_pls_rsh_agent=oarsh
>>>> LD_LIBRARY_PATH=/n/poolfs/z/home/apellegr/openmpi/lib OMPI_MCA_seed=0]
>>>> [m45-037.pool:29473] pls:rsh: launching on node m45-038.pool
>>>> [m45-037.pool:29473] pls:rsh: m45-038.pool is a REMOTE node
>>>> [m45-037.pool:29473] pls:rsh: executing: (//usr/bin/oarsh)
>>>> /usr/bin/oarsh m45-038.pool set path = (
>>>> /n/poolfs/z/home/apellegr/openmpi/bin $path ) ; if (
>>>> $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH
>>>> == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib
>>>> ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH
>>>> /n/poolfs/z/home/apellegr/openmpi/lib:$LD_LIBRARY_PATH ;
>>>> /n/poolfs/z/home/apellegr/openmpi/bin/orted --bootproxy 1 --name
>>>> 0.0.2 --num_procs 5 --vpid_start 0 --nodename m45-038.pool
>>>> --universe apellegr_at_m45-037.pool:default-universe-29473 --nsreplica
>>>> "0.0.0;tcp://10.11.45.37:55477" --gprreplica
>>>> "0.0.0;tcp://10.11.45.37:55477" [OAR_JOBID=597856 HOST=m45-037.pool
>>>> TERM=xterm SHELL=/bin/tcsh OAR_WORKING_DIRECTORY=/home/apellegr
>>>> SSH_CLIENT=10.11.0.4 50481 6667 OAR_USER=apellegr GROUP=csestudents
>>>> USER=apellegr SUDO_USER=oar OAR_WORKDIR=/home/apellegr
>>>> SUDO_UID=30143 HOSTTYPE=i486-linux USERNAME=apellegr OAR_JOB_NAME=
>>>> OAR_NODE_FILE=/var/lib/oar/597856
>>>> OAR_RESOURCE_PROPERTIES_FILE=/var/lib/oar/597856_resources
>>>> MAIL=/var/mail/oar
>>>> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/bin:/n/poolfs/z/home/apellegr/openssl/bin
>>>> OAR_PROJECT_NAME=default OAR_JOB_WALLTIME_SECONDS=7200
>>>> PWD=/home/apellegr HOME=/home/apellegr SUDO_COMMAND=OAR SHLVL=2
>>>> OAR_FILE_NODES=/var/lib/oar/597856 OSTYPE=linux VENDOR=intel
>>>> OAR_JOB_WALLTIME=2:0:0 MACHTYPE=i486 LOGNAME=apellegr
>>>> OAR_NODEFILE=/var/lib/oar/597856
>>>> OAR_RESOURCE_FILE=/var/lib/oar/597856 SUDO_GID=390
>>>> OAR_JOB_ID=597856 OAR_O_WORKDIR=/home/apellegr
>>>> _=/n/poolfs/z/home/apellegr/openmpi/bin/mpirun
>>>> OLDPWD=/home/apellegr/openmpi
>>>> OMPI_MCA_rds_hostfile_path=/var/lib/oar/597856
>>>> OMPI_MCA_pls_rsh_debug=1 OMPI_MCA_pls_rsh_agent=oarsh OMPI_MCA_seed=0]
>>>> bash: -c: line 0: syntax error near unexpected token `('
>>>> bash: -c: line 0: ` set path = (
>>>> /n/poolfs/z/home/apellegr/openmpi/bin $path ) ; if (
>>>> $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH
>>>> == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib
>>>> ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH
>>>> /n/poolfs/z/home/apellegr/openmpi/lib:$LD_LIBRARY_PATH ;
>>>> /n/poolfs/z/home/apellegr/openmpi/bin/orted --bootproxy 1 --name
>>>> 0.0.2 --num_procs 5 --vpid_start 0 --nodename m45-038.pool
>>>> --universe apellegr_at_m45-037.pool:default-universe-29473 --nsreplica
>>>> "0.0.0;tcp://10.11.45.37:55477" --gprreplica
>>>> "0.0.0;tcp://10.11.45.37:55477"'
>>>> [m45-037.pool:29473] pls:rsh: launching on node m45-039.pool
>>>> [m45-037.pool:29473] ERROR: A daemon on node m45-038.pool failed to
>>>> start as expected.
>>>> [m45-037.pool:29473] ERROR: There may be more information available
>>>> from
>>>> [m45-037.pool:29473] ERROR: the remote shell (see above).
>>>> [m45-037.pool:29473] ERROR: The daemon exited unexpectedly with
>>>> status 2.
>>>> [m45-037.pool:29473] pls:rsh: m45-039.pool is a REMOTE node
>>>> [m45-037.pool:29473] pls:rsh: executing: (//usr/bin/oarsh)
>>>> /usr/bin/oarsh m45-039.pool set path = (
>>>> /n/poolfs/z/home/apellegr/openmpi/bin $path ) ; if (
>>>> $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH
>>>> == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib
>>>> ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH
>>>> /n/poolfs/z/home/apellegr/openmpi/lib:$LD_LIBRARY_PATH ;
>>>> /n/poolfs/z/home/apellegr/openmpi/bin/orted --bootproxy 1 --name
>>>> 0.0.3 --num_procs 5 --vpid_start 0 --nodename m45-039.pool
>>>> --universe apellegr_at_m45-037.pool:default-universe-29473 --nsreplica
>>>> "0.0.0;tcp://10.11.45.37:55477" --gprreplica
>>>> "0.0.0;tcp://10.11.45.37:55477" [OAR_JOBID=597856 HOST=m45-037.pool
>>>> TERM=xterm SHELL=/bin/tcsh OAR_WORKING_DIRECTORY=/home/apellegr
>>>> SSH_CLIENT=10.11.0.4 50481 6667 OAR_USER=apellegr GROUP=csestudents
>>>> USER=apellegr SUDO_USER=oar OAR_WORKDIR=/home/apellegr
>>>> SUDO_UID=30143 HOSTTYPE=i486-linux USERNAME=apellegr OAR_JOB_NAME=
>>>> OAR_NODE_FILE=/var/lib/oar/597856
>>>> OAR_RESOURCE_PROPERTIES_FILE=/var/lib/oar/597856_resources
>>>> MAIL=/var/mail/oar
>>>> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/bin:/n/poolfs/z/home/apellegr/openssl/bin
>>>> OAR_PROJECT_NAME=default OAR_JOB_WALLTIME_SECONDS=7200
>>>> PWD=/home/apellegr HOME=/home/apellegr SUDO_COMMAND=OAR SHLVL=2
>>>> OAR_FILE_NODES=/var/lib/oar/597856 OSTYPE=linux VENDOR=intel
>>>> OAR_JOB_WALLTIME=2:0:0 MACHTYPE=i486 LOGNAME=apellegr
>>>> OAR_NODEFILE=/var/lib/oar/597856
>>>> OAR_RESOURCE_FILE=/var/lib/oar/597856 SUDO_GID=390
>>>> OAR_JOB_ID=597856 OAR_O_WORKDIR=/home/apellegr
>>>> _=/n/poolfs/z/home/apellegr/openmpi/bin/mpirun
>>>> OLDPWD=/home/apellegr/openmpi
>>>> OMPI_MCA_rds_hostfile_path=/var/lib/oar/597856
>>>> OMPI_MCA_pls_rsh_debug=1 OMPI_MCA_pls_rsh_agent=oarsh OMPI_MCA_seed=0]
>>>> bash: -c: line 0: syntax error near unexpected token `('
>>>> bash: -c: line 0: ` set path = (
>>>> /n/poolfs/z/home/apellegr/openmpi/bin $path ) ; if (
>>>> $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH
>>>> == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib
>>>> ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH
>>>> /n/poolfs/z/home/apellegr/openmpi/lib:$LD_LIBRARY_PATH ;
>>>> /n/poolfs/z/home/apellegr/openmpi/bin/orted --bootproxy 1 --name
>>>> 0.0.3 --num_procs 5 --vpid_start 0 --nodename m45-039.pool
>>>> --universe apellegr_at_m45-037.pool:default-universe-29473 --nsreplica
>>>> "0.0.0;tcp://10.11.45.37:55477" --gprreplica
>>>> "0.0.0;tcp://10.11.45.37:55477"'
>>>> [m45-037.pool:29473] pls:rsh: launching on node m45-040.pool
>>>> [m45-037.pool:29473] ERROR: A daemon on node m45-039.pool failed to
>>>> start as expected.
>>>> [m45-037.pool:29473] ERROR: There may be more information available
>>>> from
>>>> [m45-037.pool:29473] ERROR: the remote shell (see above).
>>>> [m45-037.pool:29473] ERROR: The daemon exited unexpectedly with
>>>> status 2.
>>>> [m45-037.pool:29473] pls:rsh: m45-040.pool is a REMOTE node
>>>> [m45-037.pool:29473] pls:rsh: executing: (//usr/bin/oarsh)
>>>> /usr/bin/oarsh m45-040.pool set path = (
>>>> /n/poolfs/z/home/apellegr/openmpi/bin $path ) ; if (
>>>> $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH
>>>> == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib
>>>> ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH
>>>> /n/poolfs/z/home/apellegr/openmpi/lib:$LD_LIBRARY_PATH ;
>>>> /n/poolfs/z/home/apellegr/openmpi/bin/orted --bootproxy 1 --name
>>>> 0.0.4 --num_procs 5 --vpid_start 0 --nodename m45-040.pool
>>>> --universe apellegr_at_m45-037.pool:default-universe-29473 --nsreplica
>>>> "0.0.0;tcp://10.11.45.37:55477" --gprreplica
>>>> "0.0.0;tcp://10.11.45.37:55477" [OAR_JOBID=597856 HOST=m45-037.pool
>>>> TERM=xterm SHELL=/bin/tcsh OAR_WORKING_DIRECTORY=/home/apellegr
>>>> SSH_CLIENT=10.11.0.4 50481 6667 OAR_USER=apellegr GROUP=csestudents
>>>> USER=apellegr SUDO_USER=oar OAR_WORKDIR=/home/apellegr
>>>> SUDO_UID=30143 HOSTTYPE=i486-linux USERNAME=apellegr OAR_JOB_NAME=
>>>> OAR_NODE_FILE=/var/lib/oar/597856
>>>> OAR_RESOURCE_PROPERTIES_FILE=/var/lib/oar/597856_resources
>>>> MAIL=/var/mail/oar
>>>> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/bin:/n/poolfs/z/home/apellegr/openssl/bin
>>>> OAR_PROJECT_NAME=default OAR_JOB_WALLTIME_SECONDS=7200
>>>> PWD=/home/apellegr HOME=/home/apellegr SUDO_COMMAND=OAR SHLVL=2
>>>> OAR_FILE_NODES=/var/lib/oar/597856 OSTYPE=linux VENDOR=intel
>>>> OAR_JOB_WALLTIME=2:0:0 MACHTYPE=i486 LOGNAME=apellegr
>>>> OAR_NODEFILE=/var/lib/oar/597856
>>>> OAR_RESOURCE_FILE=/var/lib/oar/597856 SUDO_GID=390
>>>> OAR_JOB_ID=597856 OAR_O_WORKDIR=/home/apellegr
>>>> _=/n/poolfs/z/home/apellegr/openmpi/bin/mpirun
>>>> OLDPWD=/home/apellegr/openmpi
>>>> OMPI_MCA_rds_hostfile_path=/var/lib/oar/597856
>>>> OMPI_MCA_pls_rsh_debug=1 OMPI_MCA_pls_rsh_agent=oarsh OMPI_MCA_seed=0]
>>>> bash: -c: line 0: syntax error near unexpected token `('
>>>> bash: -c: line 0: ` set path = (
>>>> /n/poolfs/z/home/apellegr/openmpi/bin $path ) ; if (
>>>> $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if ( $?LD_LIBRARY_PATH
>>>> == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib
>>>> ; if ( $?OMPI_have_llp == 1 ) setenv LD_LIBRARY_PATH
>>>> /n/poolfs/z/home/apellegr/openmpi/lib:$LD_LIBRARY_PATH ;
>>>> /n/poolfs/z/home/apellegr/openmpi/bin/orted --bootproxy 1 --name
>>>> 0.0.4 --num_procs 5 --vpid_start 0 --nodename m45-040.pool
>>>> --universe apellegr_at_m45-037.pool:default-universe-29473 --nsreplica
>>>> "0.0.0;tcp://10.11.45.37:55477" --gprreplica
>>>> "0.0.0;tcp://10.11.45.37:55477"'
>>>> [m45-037.pool:29473] ERROR: A daemon on node m45-040.pool failed to
>>>> start as expected.
>>>> [m45-037.pool:29473] ERROR: There may be more information available
>>>> from
>>>> [m45-037.pool:29473] ERROR: the remote shell (see above).
>>>> [m45-037.pool:29473] ERROR: The daemon exited unexpectedly with
>>>> status 2.
>>>> [m45-037.pool:29473] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>>> ../../../../orte/mca/pls/base/pls_base_orted_cmds.c at line 188
>>>> [m45-037.pool:29473] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>>>> ../../../../../orte/mca/pls/rsh/pls_rsh_module.c at line 1190
>>>> --------------------------------------------------------------------------
>>>>
>>>> mpirun was unable to cleanly terminate the daemons for this job.
>>>> Returned value Timeout instead of ORTE_SUCCESS.
>>>> --------------------------------------------------------------------------
>>>>
>>>> apellegr_at_m45-037:~$
>>>>
>>>> Can anybody help me?
>>>> Thanks,
>>>> ~Andrea
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users