Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] OpenMPI and OAR issues
From: George Bosilca (bosilca_at_[hidden])
Date: 2008-11-06 15:31:27


OAR is the batch scheduler used on the Grid5K platform. As far as I
know, set is a basic shell built-in command, and it is understood by
all shells. The problem here seems to be that somehow we're using
bash, but with tcsh shell code (because setenv is definitely not
something that bash understands).

   george.
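
For comparison, here is a minimal sketch of what the failing tcsh fragment is trying to do, written in Bourne-style syntax that bash accepts. The prefix below is a hypothetical placeholder, not the path from this thread, and this is only an illustration of the syntax difference, not the command line that Open MPI itself generates for a Bourne-family shell.

```shell
# Hypothetical install prefix (assumption), standing in for the -prefix value.
prefix="/opt/openmpi-example"

# csh/tcsh form, as seen in the failing command line (rejected by bash):
#   set path = ( $prefix/bin $path )
#   if ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH $prefix/lib
#   if ( $?LD_LIBRARY_PATH == 1 ) setenv LD_LIBRARY_PATH $prefix/lib:$LD_LIBRARY_PATH

# Bourne-style equivalent: prepend the bin directory to PATH.
PATH="$prefix/bin:$PATH"
export PATH

# ${VAR+x} expands to "x" only when VAR is set, which mirrors csh's $?VAR test.
if [ -z "${LD_LIBRARY_PATH+x}" ]; then
    LD_LIBRARY_PATH="$prefix/lib"
else
    LD_LIBRARY_PATH="$prefix/lib:$LD_LIBRARY_PATH"
fi
export LD_LIBRARY_PATH

# Show the first entry of each search path.
echo "PATH starts with: ${PATH%%:*}"
echo "LD_LIBRARY_PATH starts with: ${LD_LIBRARY_PATH%%:*}"
```

The parenthesized list in `set path = ( ... )` is what triggers bash's "syntax error near unexpected token `('" message; bash never gets as far as the `setenv` calls.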

On Nov 6, 2008, at 3:07 PM, Ralph Castain wrote:

> I have no idea what "oar" is, but it looks to me like the rsh
> launcher is getting confused about the remote shell it will use - I
> don't believe that the "set" cmd shown below is proper bash syntax,
> and that is the error that is causing the launch to fail.
>
> What remote shell should it find? I know we don't have any "oar"
> shell-specific code in the system, but maybe it looks like something
> else?
>
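
On the question of which remote shell is in play, a rough sketch of how one might check is shown here. The `shell_family` helper is hypothetical and written only for this illustration; it is not Open MPI code. The commented commands assume that `oarsh` accepts a remote command the way ssh does, and that this Open MPI release has a `pls_rsh_assume_same_shell` MCA parameter; both should be verified on the cluster (for example with `ompi_info --param pls rsh`) before relying on them.

```shell
# Hypothetical helper: classify a shell path into the two syntax families
# that matter for the launcher's environment-setup fragment.
shell_family() {
    case "$(basename "$1")" in
        csh|tcsh)             echo csh ;;
        sh|bash|ksh|zsh|dash) echo bourne ;;
        *)                    echo unknown ;;
    esac
}

shell_family /bin/tcsh   # prints "csh"
shell_family /bin/bash   # prints "bourne"

# On the cluster, one could compare the local and remote login shells
# (node name taken from the log in this thread; the shell that actually
# runs the command under oarsh may still differ from $SHELL):
#   echo $SHELL
#   oarsh m45-038.pool 'echo $SHELL'
#
# Assumption to verify: asking mpirun to probe the remote shell instead of
# assuming it matches the local one.
#   mpirun -mca pls_rsh_assume_same_shell 0 -mca pls_rsh_agent oarsh ...
```

If the two families differ, as the tcsh-versus-bash mismatch in the log suggests, the tcsh-style setup fragment will be rejected by the remote side.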
> On Nov 6, 2008, at 12:55 PM, Andrea Pellegrini wrote:
>
>> Hi all,
>> I'm trying to run an Open MPI application on an OAR cluster. I think
>> the cluster is configured correctly but I still have problems when
>> I run mpirun:
>>
>> apellegr_at_m45-037:~$ mpirun -prefix /n/poolfs/z/home/apellegr/
>> openmpi -machinefile $OAR_FILE_NODES -mca pls_rsh_agent "oarsh" -np
>> 10 /n/poolfs/z/home/apellegr/mpi_test/hello_world.x86
>> bash: -c: line 0: syntax error near unexpected token `('
>> bash: -c: line 0: ` set path = ( /n/poolfs/z/home/apellegr/openmpi/
>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>> bootproxy 1 --name 0.0.4 --num_procs 5 --vpid_start 0 --nodename
>> m45-040.pool --universe apellegr_at_m45-037.pool:default-
>> universe-29482 --nsreplica "0.0.0;tcp://10.11.45.37:36790" --
>> gprreplica "0.0.0;tcp://10.11.45.37:36790"'
>> bash: -c: line 0: syntax error near unexpected token `('
>> bash: -c: line 0: ` set path = ( /n/poolfs/z/home/apellegr/openmpi/
>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>> bootproxy 1 --name 0.0.2 --num_procs 5 --vpid_start 0 --nodename
>> m45-038.pool --universe apellegr_at_m45-037.pool:default-
>> universe-29482 --nsreplica "0.0.0;tcp://10.11.45.37:36790" --
>> gprreplica "0.0.0;tcp://10.11.45.37:36790"'
>> [m45-037.pool:29482] ERROR: A daemon on node m45-038.pool failed to
>> start as expected.
>> [m45-037.pool:29482] ERROR: There may be more information available
>> from
>> [m45-037.pool:29482] ERROR: the remote shell (see above).
>> [m45-037.pool:29482] ERROR: The daemon exited unexpectedly with
>> status 2.
>> bash: -c: line 0: syntax error near unexpected token `('
>> bash: -c: line 0: ` set path = ( /n/poolfs/z/home/apellegr/openmpi/
>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>> bootproxy 1 --name 0.0.3 --num_procs 5 --vpid_start 0 --nodename
>> m45-039.pool --universe apellegr_at_m45-037.pool:default-
>> universe-29482 --nsreplica "0.0.0;tcp://10.11.45.37:36790" --
>> gprreplica "0.0.0;tcp://10.11.45.37:36790"'
>> [m45-037.pool:29482] ERROR: A daemon on node m45-039.pool failed to
>> start as expected.
>> [m45-037.pool:29482] ERROR: There may be more information available
>> from
>> [m45-037.pool:29482] ERROR: the remote shell (see above).
>> [m45-037.pool:29482] ERROR: The daemon exited unexpectedly with
>> status 2.
>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in
>> file ../../../../orte/mca/pls/base/pls_base_orted_cmds.c at line 275
>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in
>> file ../../../../../orte/mca/pls/rsh/pls_rsh_module.c at line 1158
>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in
>> file ../../../../../orte/mca/errmgr/hnp/errmgr_hnp.c at line 90
>> [m45-037.pool:29482] ERROR: A daemon on node m45-040.pool failed to
>> start as expected.
>> [m45-037.pool:29482] ERROR: There may be more information available
>> from
>> [m45-037.pool:29482] ERROR: the remote shell (see above).
>> [m45-037.pool:29482] ERROR: The daemon exited unexpectedly with
>> status 2.
>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in
>> file ../../../../orte/mca/pls/base/pls_base_orted_cmds.c at line 188
>> [m45-037.pool:29482] [0,0,0] ORTE_ERROR_LOG: Timeout in
>> file ../../../../../orte/mca/pls/rsh/pls_rsh_module.c at line 1190
>> --------------------------------------------------------------------------
>> mpirun was unable to cleanly terminate the daemons for this job.
>> Returned value Timeout instead of ORTE_SUCCESS.
>> --------------------------------------------------------------------------
>> apellegr_at_m45-037:~$
>>
>>
>> If I run it with the option "-mca pls_rsh_debug 1" I get:
>>
>> apellegr_at_m45-037:~$ mpirun -prefix /n/poolfs/z/home/apellegr/
>> openmpi -machinefile $OAR_FILE_NODES -mca pls_rsh_debug 1 -mca
>> pls_rsh_agent "oarsh" -np 10 /n/poolfs/z/home/apellegr/mpi_test/
>> hello_world.x86
>> [m45-037.pool:29473] pls:rsh: local shell: 2 (tcsh)
>> [m45-037.pool:29473] pls:rsh: assuming same remote shell as local
>> shell
>> [m45-037.pool:29473] pls:rsh: remote shell: 2 (tcsh)
>> [m45-037.pool:29473] pls:rsh: final template argv:
>> [m45-037.pool:29473] pls:rsh: /usr/bin/oarsh <template> orted --
>> bootproxy 1 --name <template> --num_procs 5 --vpid_start 0 --
>> nodename <template> --universe apellegr_at_m45-037.pool:default-
>> universe-29473 --nsreplica "0.0.0;tcp://10.11.45.37:55477" --
>> gprreplica "0.0.0;tcp://10.11.45.37:55477"
>> [m45-037.pool:29473] pls:rsh: launching on node m45-037.pool
>> [m45-037.pool:29473] pls:rsh: m45-037.pool is a LOCAL node
>> [m45-037.pool:29473] pls:rsh: reset PATH: /n/poolfs/z/home/apellegr/
>> openmpi/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/
>> sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/bin:/n/
>> poolfs/z/home/apellegr/openssl/bin
>> [m45-037.pool:29473] pls:rsh: reset LD_LIBRARY_PATH: /n/poolfs/z/
>> home/apellegr/openmpi/lib
>> [m45-037.pool:29473] pls:rsh: changing to directory /home/apellegr
>> [m45-037.pool:29473] pls:rsh: executing: (/n/poolfs/z/home/apellegr/
>> openmpi/bin/orted) orted --bootproxy 1 --name 0.0.1 --num_procs 5 --
>> vpid_start 0 --nodename m45-037.pool --universe
>> apellegr_at_m45-037.pool:default-universe-29473 --nsreplica
>> "0.0.0;tcp://10.11.45.37:55477" --gprreplica "0.0.0;tcp://
>> 10.11.45.37:55477" --set-sid [OAR_JOBID=597856 HOST=m45-037.pool
>> TERM=xterm SHELL=/bin/tcsh OAR_WORKING_DIRECTORY=/home/apellegr
>> SSH_CLIENT=10.11.0.4 50481 6667 OAR_USER=apellegr GROUP=csestudents
>> USER=apellegr SUDO_USER=oar OAR_WORKDIR=/home/apellegr
>> SUDO_UID=30143 HOSTTYPE=i486-linux USERNAME=apellegr OAR_JOB_NAME=
>> OAR_NODE_FILE=/var/lib/oar/597856 OAR_RESOURCE_PROPERTIES_FILE=/var/
>> lib/oar/597856_resources MAIL=/var/mail/oar PATH=/n/poolfs/z/home/
>> apellegr/openmpi/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/
>> bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/
>> bin:/n/poolfs/z/home/apellegr/openssl/bin OAR_PROJECT_NAME=default
>> OAR_JOB_WALLTIME_SECONDS=7200 PWD=/home/apellegr HOME=/home/
>> apellegr SUDO_COMMAND=OAR SHLVL=2 OAR_FILE_NODES=/var/lib/oar/
>> 597856 OSTYPE=linux VENDOR=intel OAR_JOB_WALLTIME=2:0:0
>> MACHTYPE=i486 LOGNAME=apellegr OAR_NODEFILE=/var/lib/oar/597856
>> OAR_RESOURCE_FILE=/var/lib/oar/597856 SUDO_GID=390
>> OAR_JOB_ID=597856 OAR_O_WORKDIR=/home/apellegr _=/n/poolfs/z/home/
>> apellegr/openmpi/bin/mpirun OLDPWD=/home/apellegr/openmpi
>> OMPI_MCA_rds_hostfile_path=/var/lib/oar/597856
>> OMPI_MCA_pls_rsh_debug=1 OMPI_MCA_pls_rsh_agent=oarsh
>> LD_LIBRARY_PATH=/n/poolfs/z/home/apellegr/openmpi/lib
>> OMPI_MCA_seed=0]
>> [m45-037.pool:29473] pls:rsh: launching on node m45-038.pool
>> [m45-037.pool:29473] pls:rsh: m45-038.pool is a REMOTE node
>> [m45-037.pool:29473] pls:rsh: executing: (//usr/bin/oarsh) /usr/bin/
>> oarsh m45-038.pool set path = ( /n/poolfs/z/home/apellegr/openmpi/
>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>> bootproxy 1 --name 0.0.2 --num_procs 5 --vpid_start 0 --nodename
>> m45-038.pool --universe apellegr_at_m45-037.pool:default-
>> universe-29473 --nsreplica "0.0.0;tcp://10.11.45.37:55477" --
>> gprreplica "0.0.0;tcp://10.11.45.37:55477" [OAR_JOBID=597856
>> HOST=m45-037.pool TERM=xterm SHELL=/bin/tcsh OAR_WORKING_DIRECTORY=/
>> home/apellegr SSH_CLIENT=10.11.0.4 50481 6667 OAR_USER=apellegr
>> GROUP=csestudents USER=apellegr SUDO_USER=oar OAR_WORKDIR=/home/
>> apellegr SUDO_UID=30143 HOSTTYPE=i486-linux USERNAME=apellegr
>> OAR_JOB_NAME= OAR_NODE_FILE=/var/lib/oar/597856
>> OAR_RESOURCE_PROPERTIES_FILE=/var/lib/oar/597856_resources MAIL=/
>> var/mail/oar PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/
>> bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/
>> bin:/n/poolfs/z/home/apellegr/openssl/bin OAR_PROJECT_NAME=default
>> OAR_JOB_WALLTIME_SECONDS=7200 PWD=/home/apellegr HOME=/home/
>> apellegr SUDO_COMMAND=OAR SHLVL=2 OAR_FILE_NODES=/var/lib/oar/
>> 597856 OSTYPE=linux VENDOR=intel OAR_JOB_WALLTIME=2:0:0
>> MACHTYPE=i486 LOGNAME=apellegr OAR_NODEFILE=/var/lib/oar/597856
>> OAR_RESOURCE_FILE=/var/lib/oar/597856 SUDO_GID=390
>> OAR_JOB_ID=597856 OAR_O_WORKDIR=/home/apellegr _=/n/poolfs/z/home/
>> apellegr/openmpi/bin/mpirun OLDPWD=/home/apellegr/openmpi
>> OMPI_MCA_rds_hostfile_path=/var/lib/oar/597856
>> OMPI_MCA_pls_rsh_debug=1 OMPI_MCA_pls_rsh_agent=oarsh
>> OMPI_MCA_seed=0]
>> bash: -c: line 0: syntax error near unexpected token `('
>> bash: -c: line 0: ` set path = ( /n/poolfs/z/home/apellegr/openmpi/
>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>> bootproxy 1 --name 0.0.2 --num_procs 5 --vpid_start 0 --nodename
>> m45-038.pool --universe apellegr_at_m45-037.pool:default-
>> universe-29473 --nsreplica "0.0.0;tcp://10.11.45.37:55477" --
>> gprreplica "0.0.0;tcp://10.11.45.37:55477"'
>> [m45-037.pool:29473] pls:rsh: launching on node m45-039.pool
>> [m45-037.pool:29473] ERROR: A daemon on node m45-038.pool failed to
>> start as expected.
>> [m45-037.pool:29473] ERROR: There may be more information available
>> from
>> [m45-037.pool:29473] ERROR: the remote shell (see above).
>> [m45-037.pool:29473] ERROR: The daemon exited unexpectedly with
>> status 2.
>> [m45-037.pool:29473] pls:rsh: m45-039.pool is a REMOTE node
>> [m45-037.pool:29473] pls:rsh: executing: (//usr/bin/oarsh) /usr/bin/
>> oarsh m45-039.pool set path = ( /n/poolfs/z/home/apellegr/openmpi/
>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>> bootproxy 1 --name 0.0.3 --num_procs 5 --vpid_start 0 --nodename
>> m45-039.pool --universe apellegr_at_m45-037.pool:default-
>> universe-29473 --nsreplica "0.0.0;tcp://10.11.45.37:55477" --
>> gprreplica "0.0.0;tcp://10.11.45.37:55477" [OAR_JOBID=597856
>> HOST=m45-037.pool TERM=xterm SHELL=/bin/tcsh OAR_WORKING_DIRECTORY=/
>> home/apellegr SSH_CLIENT=10.11.0.4 50481 6667 OAR_USER=apellegr
>> GROUP=csestudents USER=apellegr SUDO_USER=oar OAR_WORKDIR=/home/
>> apellegr SUDO_UID=30143 HOSTTYPE=i486-linux USERNAME=apellegr
>> OAR_JOB_NAME= OAR_NODE_FILE=/var/lib/oar/597856
>> OAR_RESOURCE_PROPERTIES_FILE=/var/lib/oar/597856_resources MAIL=/
>> var/mail/oar PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/
>> bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/
>> bin:/n/poolfs/z/home/apellegr/openssl/bin OAR_PROJECT_NAME=default
>> OAR_JOB_WALLTIME_SECONDS=7200 PWD=/home/apellegr HOME=/home/
>> apellegr SUDO_COMMAND=OAR SHLVL=2 OAR_FILE_NODES=/var/lib/oar/
>> 597856 OSTYPE=linux VENDOR=intel OAR_JOB_WALLTIME=2:0:0
>> MACHTYPE=i486 LOGNAME=apellegr OAR_NODEFILE=/var/lib/oar/597856
>> OAR_RESOURCE_FILE=/var/lib/oar/597856 SUDO_GID=390
>> OAR_JOB_ID=597856 OAR_O_WORKDIR=/home/apellegr _=/n/poolfs/z/home/
>> apellegr/openmpi/bin/mpirun OLDPWD=/home/apellegr/openmpi
>> OMPI_MCA_rds_hostfile_path=/var/lib/oar/597856
>> OMPI_MCA_pls_rsh_debug=1 OMPI_MCA_pls_rsh_agent=oarsh
>> OMPI_MCA_seed=0]
>> bash: -c: line 0: syntax error near unexpected token `('
>> bash: -c: line 0: ` set path = ( /n/poolfs/z/home/apellegr/openmpi/
>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>> bootproxy 1 --name 0.0.3 --num_procs 5 --vpid_start 0 --nodename
>> m45-039.pool --universe apellegr_at_m45-037.pool:default-
>> universe-29473 --nsreplica "0.0.0;tcp://10.11.45.37:55477" --
>> gprreplica "0.0.0;tcp://10.11.45.37:55477"'
>> [m45-037.pool:29473] pls:rsh: launching on node m45-040.pool
>> [m45-037.pool:29473] ERROR: A daemon on node m45-039.pool failed to
>> start as expected.
>> [m45-037.pool:29473] ERROR: There may be more information available
>> from
>> [m45-037.pool:29473] ERROR: the remote shell (see above).
>> [m45-037.pool:29473] ERROR: The daemon exited unexpectedly with
>> status 2.
>> [m45-037.pool:29473] pls:rsh: m45-040.pool is a REMOTE node
>> [m45-037.pool:29473] pls:rsh: executing: (//usr/bin/oarsh) /usr/bin/
>> oarsh m45-040.pool set path = ( /n/poolfs/z/home/apellegr/openmpi/
>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>> bootproxy 1 --name 0.0.4 --num_procs 5 --vpid_start 0 --nodename
>> m45-040.pool --universe apellegr_at_m45-037.pool:default-
>> universe-29473 --nsreplica "0.0.0;tcp://10.11.45.37:55477" --
>> gprreplica "0.0.0;tcp://10.11.45.37:55477" [OAR_JOBID=597856
>> HOST=m45-037.pool TERM=xterm SHELL=/bin/tcsh OAR_WORKING_DIRECTORY=/
>> home/apellegr SSH_CLIENT=10.11.0.4 50481 6667 OAR_USER=apellegr
>> GROUP=csestudents USER=apellegr SUDO_USER=oar OAR_WORKDIR=/home/
>> apellegr SUDO_UID=30143 HOSTTYPE=i486-linux USERNAME=apellegr
>> OAR_JOB_NAME= OAR_NODE_FILE=/var/lib/oar/597856
>> OAR_RESOURCE_PROPERTIES_FILE=/var/lib/oar/597856_resources MAIL=/
>> var/mail/oar PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/
>> bin:/sbin:/bin:/usr/X11R6/bin:/n/poolfs/z/home/apellegr/openmpi/
>> bin:/n/poolfs/z/home/apellegr/openssl/bin OAR_PROJECT_NAME=default
>> OAR_JOB_WALLTIME_SECONDS=7200 PWD=/home/apellegr HOME=/home/
>> apellegr SUDO_COMMAND=OAR SHLVL=2 OAR_FILE_NODES=/var/lib/oar/
>> 597856 OSTYPE=linux VENDOR=intel OAR_JOB_WALLTIME=2:0:0
>> MACHTYPE=i486 LOGNAME=apellegr OAR_NODEFILE=/var/lib/oar/597856
>> OAR_RESOURCE_FILE=/var/lib/oar/597856 SUDO_GID=390
>> OAR_JOB_ID=597856 OAR_O_WORKDIR=/home/apellegr _=/n/poolfs/z/home/
>> apellegr/openmpi/bin/mpirun OLDPWD=/home/apellegr/openmpi
>> OMPI_MCA_rds_hostfile_path=/var/lib/oar/597856
>> OMPI_MCA_pls_rsh_debug=1 OMPI_MCA_pls_rsh_agent=oarsh
>> OMPI_MCA_seed=0]
>> bash: -c: line 0: syntax error near unexpected token `('
>> bash: -c: line 0: ` set path = ( /n/poolfs/z/home/apellegr/openmpi/
>> bin $path ) ; if ( $?LD_LIBRARY_PATH == 1 ) set OMPI_have_llp ; if
>> ( $?LD_LIBRARY_PATH == 0 ) setenv LD_LIBRARY_PATH /n/poolfs/z/home/
>> apellegr/openmpi/lib ; if ( $?OMPI_have_llp == 1 ) setenv
>> LD_LIBRARY_PATH /n/poolfs/z/home/apellegr/openmpi/lib:
>> $LD_LIBRARY_PATH ; /n/poolfs/z/home/apellegr/openmpi/bin/orted --
>> bootproxy 1 --name 0.0.4 --num_procs 5 --vpid_start 0 --nodename
>> m45-040.pool --universe apellegr_at_m45-037.pool:default-
>> universe-29473 --nsreplica "0.0.0;tcp://10.11.45.37:55477" --
>> gprreplica "0.0.0;tcp://10.11.45.37:55477"'
>> [m45-037.pool:29473] ERROR: A daemon on node m45-040.pool failed to
>> start as expected.
>> [m45-037.pool:29473] ERROR: There may be more information available
>> from
>> [m45-037.pool:29473] ERROR: the remote shell (see above).
>> [m45-037.pool:29473] ERROR: The daemon exited unexpectedly with
>> status 2.
>> [m45-037.pool:29473] [0,0,0] ORTE_ERROR_LOG: Timeout in
>> file ../../../../orte/mca/pls/base/pls_base_orted_cmds.c at line 188
>> [m45-037.pool:29473] [0,0,0] ORTE_ERROR_LOG: Timeout in
>> file ../../../../../orte/mca/pls/rsh/pls_rsh_module.c at line 1190
>> --------------------------------------------------------------------------
>> mpirun was unable to cleanly terminate the daemons for this job.
>> Returned value Timeout instead of ORTE_SUCCESS.
>> --------------------------------------------------------------------------
>> apellegr_at_m45-037:~$
>>
>> Can anybody help me?
>> Thanks,
>> ~Andrea
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users