Hah! Your reply came in seconds after I replied.
Your questions made me notice that we're missing a FAQ entry for the
"ssh:rsh" explanation, though, so I'll add an entry for that. Thanks.
On May 18, 2007, at 5:15 PM, Steven Truong wrote:
> Hi, Jeff. Ok. After reading through the FAQ, I modified .bashrc to
> set PATH and LD_LIBRARY_PATH and now I could execute:
> [struong_at_neptune ~]$ ssh node07 which orted
> /usr/local/
> [struong_at_neptune ~]$ /usr/local/openmpi-1.2.1/bin/mpirun --host node07 hostname
> node07.nanostellar.com
> Thank you.
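For the archive, the .bashrc change described above might look something like the sketch below. It assumes the /usr/local/openmpi-1.2.1 prefix used in this thread and that the libraries sit in lib/ under that prefix (the usual layout, but still an assumption about this install); the OMPI_PREFIX variable name is only illustrative. The point is that the exports come before anything that is skipped for non-interactive shells, so "ssh node07 which orted" sees them too.

```shell
# Hypothetical ~/.bashrc fragment; OMPI_PREFIX is an illustrative name.
# Assumes the install prefix from this thread and a lib/ directory under it.
OMPI_PREFIX=/usr/local/openmpi-1.2.1

# Set these first, so non-interactive shells (e.g. "ssh node07 which orted")
# pick them up as well.
export PATH="$OMPI_PREFIX/bin:$PATH"
# Append the old value only if it was already set, to avoid a trailing colon.
export LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Interactive-only settings go inside this guard, after the exports.
case $- in
    *i*)
        # prompt, aliases, etc. would go here
        : ;;
esac
```

After editing the file, "ssh node07 which orted" is a quick way to check that the remote non-interactive shell now finds orted.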
> On 5/18/07, Steven Truong <midair77_at_[hidden]> wrote:
>> Hi, Jeff. Thanks so very much for all your help so far. I decided
>> that I needed to go back and check whether openmpi even works for
>> simple cases, so here I am.
>> So my shell might have exited when it detected that I ran
>> non-interactively. But then again, how does this parameter
>> MCA pls: parameter "pls_rsh_agent" (current value: "ssh :rsh")
>> affect my outcome? How am I going to set PATH and LD_LIBRARY_PATH to
>> be like those in .bash_profile in my Torque job files?
>> Could you give me some tips here?
>> Below is my current bash shell's settings.
>> [struong_at_neptune ~]$ echo $SHELL
>> [struong_at_neptune ~]$ cat .bash_profile | grep -v ^#
>> if [ -f ~/.bashrc ]; then
>>     . ~/.bashrc
>> fi
>> umask 027
>> F77_GETARGDECL=" "
>> source /usr/local/ecce/scripts/runtime_setup.sh
>> export F77 USERNAME BASH_ENV PATH RSHCOMMAND FC F90 PBS_DEFAULT
>> BUILD_DIR INSTALL_DIR LD_LIBRARY_PATH
>> [struong_at_neptune ~]$ ssh node07 which orted
>> which: no orted in (/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin)
>> [struong_at_neptune ~]$ /usr/local/openmpi-1.2.1/bin/mpirun --host
>> node07 hostname
>> Failed to find the following executable:
>> Host: node07.nanostellar.com
>> Executable: node07
>> Cannot continue.
>> On 5/18/07, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>>> On May 18, 2007, at 4:38 PM, Steven Truong wrote:
>>>> [struong_at_neptune 4cpu4npar10nsim]$ mpirun --mca btl tcp,self -np 1
>>>> --host node07 hostname
>>>> bash: orted: command not found
>>> As you noted later in your mail, this is the key problem: orted is
>>> not found on the remote node.
>>> Notice that you are currently using the rsh launcher, not the Torque
>>> launcher (presumably because you are not inside a Torque job). What
>>> you want to check is:
>>> rsh node07 which orted
>>> (or use ssh -- whatever is correct for your cluster)
>>> I suspect that orted will not be found, and that you'll need to
>>> modify your shell startup files to set PATH / LD_LIBRARY_PATH
>>> properly. Note that some shell startup files will exit early if
>>> they detect that they are running in a non-interactive login. See
>>> www.open-mpi.org/faq/?category=running#adding-ompi-to-path for more
>>> details.
>>> Alternatively, you can simply use the absolute pathname to mpirun,
>>> which Open MPI will interpret to mean that you want OMPI to set the
>>> PATH/LD_LIBRARY_PATH on the remote node for you. Something like
>>> /usr/local/openmpi-1.2.1/bin/mpirun --host node07 hostname
>>> (note that the "btl" MCA parameter is only relevant for MPI
>>> applications)
>>> Jeff Squyres
>>> Cisco Systems
>>> users mailing list