Open MPI User's Mailing List Archives

Subject: [OMPI users] mpirun is in the PATH, but "orted: command not found"
From: giggzounet (giggzounet_at_[hidden])
Date: 2012-03-26 04:08:01


Hi,

My problem:
On our cluster, Open MPI 1.4.4 is installed. We use the environment modules
system, so I created a module file to set up Open MPI:
prepend-path PATH /appl/mpi/openmpi/1.4.4/bin
prepend-path LD_LIBRARY_PATH /appl/mpi/openmpi/1.4.4/lib
prepend-path MANPATH /appl/mpi/openmpi/1.4.4/share/man
setenv MPI_BIN /appl/mpi/openmpi/1.4.4/bin
setenv MPI_SYSCONFIG /appl/mpi/openmpi/1.4.4/etc
setenv MPI_INCLUDE /appl/mpi/openmpi/1.4.4/include
setenv MPI_LIB /appl/mpi/openmpi/1.4.4/lib
setenv MPI_MAN /appl/mpi/openmpi/1.4.4/share/man
setenv MPI_COMPILER openmpi-x86_64
setenv MPI_SUFFIX _openmpi
setenv MPI_HOME /appl/mpi/openmpi/1.4.4

This Open MPI module loads without problems, and mpirun, orted, etc. are in
the PATH.
Now I want to start a PBS job:
#!/bin/bash
#PBS -N mpi-test
#PBS -j oe
#PBS -m abe
#PBS -l nodes=2:ppn=2
#PBS -l walltime=2:00:00
#PBS -q long
module list
module unload mpi/intel-mpi/2012
module load mpi/openmpi/1.4.4
module list
cd $PBS_O_WORKDIR
cat $PBS_NODEFILE > hosts_openmpi
mpirun -n $NUMPROCS -machinefile ./hosts_openmpi mpitests-IMB-MPI1

And I get:
bash: orted: command not found
--------------------------------------------------------------------------
A daemon (pid 7399) died unexpectedly with status 127 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished

This is very strange: /appl/mpi/openmpi/1.4.4/bin/ is in the PATH inside the
PBS environment (I checked that with env in a PBS job), but it still doesn't
work.
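
Status 127 is the exit status a shell returns when it cannot find a command, and the "bash: orted: command not found" line most likely comes from the shell that mpirun starts on another node, not from the job script itself. Running env inside the job only shows the environment on the node where the script runs. A minimal sketch for checking the other nodes, assuming mpirun reaches them over ssh (the launcher is not stated above) and that passwordless ssh works; check_remote_cmd is a hypothetical helper name:

```shell
# Report, for each distinct host in a machinefile, whether a command
# resolves in the PATH of a NON-interactive ssh shell on that host.
# Assumption: passwordless ssh to the nodes is available.
check_remote_cmd() {
    cmd=$1
    hostfile=$2
    sort -u "$hostfile" | while read -r host; do
        # Skip blank lines in the machinefile.
        [ -n "$host" ] || continue
        # -n keeps ssh from swallowing the rest of the host list on stdin;
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ssh -n -o BatchMode=yes "$host" "command -v $cmd" >/dev/null 2>&1; then
            echo "$host: $cmd found"
        else
            echo "$host: $cmd NOT found"
        fi
    done
}

# Example (inside the PBS job, after "module load"):
#   check_remote_cmd orted ./hosts_openmpi
```

If a host reports "NOT found", the module's PATH change is not reaching non-interactive shells on that node, which would match the error above.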

However, when I give the full paths, the command
/appl/mpi/openmpi/1.4.4/bin/mpirun -n $NUMPROCS -machinefile ./hosts_openmpi /appl/mpi/openmpi/1.4.4/bin/mpitests-IMB-MPI1
runs without problems.
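
The full-path invocation probably works because, according to the mpirun man page, calling mpirun by its absolute pathname is equivalent to passing --prefix with the installation directory (the path with the trailing bin/mpirun removed), which tells Open MPI where to find orted on the remote nodes. A small sketch of that equivalence; the two helper functions are hypothetical and only print the command line instead of running it:

```shell
# Derive the installation prefix from an absolute mpirun path by
# stripping the trailing /bin/mpirun components.
prefix_from_mpirun() {
    dirname "$(dirname "$1")"
}

# Print (do not run) the --prefix form of a full-path mpirun call.
# First argument: absolute path to mpirun; remaining arguments are
# passed through unchanged.
mpirun_with_prefix() {
    prefix=$(prefix_from_mpirun "$1")
    shift
    printf '%s\n' "mpirun --prefix $prefix $*"
}

# Example:
#   mpirun_with_prefix /appl/mpi/openmpi/1.4.4/bin/mpirun \
#       -n 4 -machinefile ./hosts_openmpi mpitests-IMB-MPI1
```

With the paths from this post, the printed command would use --prefix /appl/mpi/openmpi/1.4.4, so the job script could keep the short mpirun name and add that option.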

So I don't understand where I made a mistake. Could someone help me?
Thanks a lot,
Best regards,
Guillaume