Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Setting up Open MPI to run on multiple servers
From: Gus Correa (gus_at_[hidden])
Date: 2008-08-12 12:38:33


Hi Rayne Lance and list

I second Lenny's suggestion.
The easiest way to get started is to use the full path to mpiexec when
you run your program, and to the MPI compiler wrappers (mpicc, etc.)
when you compile it.
If you don't use the MPI compiler wrappers, make sure your Makefile
points to the correct MPI libraries and include files (use full paths
again).
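
As a rough sketch of the full-path approach: the prefix below is the one
reported earlier in this thread, and hello.c is a made-up source file
name, so adjust both to your own setup. Nothing is run unless the tools
and the source file are actually there.

```shell
#!/bin/sh
# Assumption: Open MPI is installed under this prefix (taken from this thread).
OMPI_PREFIX=${OMPI_PREFIX:-/usr/lib/openmpi/1.2.5-gcc}
MPICC="$OMPI_PREFIX/bin/mpicc"
MPIEXEC="$OMPI_PREFIX/bin/mpiexec"

# hello.c is a hypothetical source file used only for illustration.
if [ -x "$MPICC" ] && [ -x "$MPIEXEC" ] && [ -f hello.c ]; then
    # Compile and run with the same installation, both by full path.
    "$MPICC" -o a.out hello.c && "$MPIEXEC" -n 2 ./a.out
else
    echo "Open MPI tools or hello.c not found; adjust OMPI_PREFIX" >&2
fi
```

Because both commands come from the same directory, the compiler wrapper
and the launcher cannot silently belong to two different MPI packages.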

Linux distributions, commercial compilers, and other software packages
often come with their own MPI libraries, which are often installed in
places that have high precedence on your PATH.
This can produce very confusing effects at compile time and at run time,
and all sorts of mix-ups (e.g. inadvertently using the LAM mpirun to run
a program that was, also inadvertently, compiled with the mpicc from
MPICH2).
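
To see whether you are in this situation, it may help to list every copy
of the MPI tools that your PATH can reach, in the order the shell would
find them. The helper below is only an illustration (list_on_path is a
name made up here, not part of any MPI package); the first line printed
for each tool is the one that runs when you type the bare command.

```shell
#!/bin/sh
# List every executable copy of a command found on PATH, in search order.
# list_on_path is a hypothetical helper name, not an existing tool.
list_on_path() {
    cmd=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        if [ -f "${dir:-.}/$cmd" ] && [ -x "${dir:-.}/$cmd" ]; then
            echo "${dir:-.}/$cmd"
        fi
    done
    IFS=$old_ifs
}

for tool in mpicc mpiexec mpirun; do
    echo "== $tool =="
    list_on_path "$tool"
done
```

If more than one directory shows up for a tool, or mpicc and mpiexec come
from different directories, that is a strong hint of a mixed installation.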

There are other ways to solve this problem (e.g. the Environment Modules
package, editing your .bashrc/.cshrc file, etc.).
However, to get started, using full paths is perfectly acceptable
and gives the most pleasure with the least suffering.
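
For the .bashrc route, something along these lines could work. It is a
sketch only: the prefix is the one mentioned in this thread, the lib
subdirectory is an assumption about where the libraries were installed,
and prepend_first is a helper name invented for this example.

```shell
# Assumption: Open MPI prefix as reported in this thread; adjust to your install.
OMPI_PREFIX=/usr/lib/openmpi/1.2.5-gcc

# Put a directory at the front of a colon-separated list, unless it is
# already the first entry. prepend_first is a hypothetical helper name.
prepend_first() {
    case "$2:" in
        "$1:"*) printf '%s\n' "$2" ;;
        *)      printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

PATH=$(prepend_first "$OMPI_PREFIX/bin" "$PATH")
# Assumption: the shared libraries live in $OMPI_PREFIX/lib.
LD_LIBRARY_PATH=$(prepend_first "$OMPI_PREFIX/lib" "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

The check for the first entry keeps PATH from growing each time the file
is sourced, while still making sure the Open MPI directory is searched
before /usr/bin. The same settings are needed on every machine that will
run the job.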

In the long term, you may consider installing a batch job and resource
manager package.
I use Torque/PBS, which is free, great, and easily available (e.g.
through yum).
It may well be installed on your computers already.
Check with "which pbs_server" or "locate torque".
You can compile Open MPI with Torque/PBS support, and thereafter use
Torque/PBS to submit your jobs.
Torque keeps track of which processors/nodes are available (no hostfile
required), sends the jobs to the correct ones, queues up as many jobs as
you want, etc.
You can comfortably monitor your jobs through a fancy pair of GUIs
(xpbs, xpbsmon).
Torque/PBS makes life much easier, after some simple configuration and a
short learning curve.
It is well documented, has active support from Cluster Resources, and
has a mailing list for extra help.
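
To give an idea of what a Torque/PBS job looks like, here is a minimal
sketch. The job name, node counts, walltime, and the file name
mpi_job.pbs are placeholders, and the mpiexec path is the one from this
thread. It assumes Open MPI was built with Torque support (configure
option --with-tm); "ompi_info | grep tm" should show tm components if so.

```shell
#!/bin/sh
# Write a minimal Torque/PBS job script. All names and resource values
# below are placeholders chosen for illustration.
cat > mpi_job.pbs <<'EOF'
#!/bin/sh
#PBS -N mpi_test
#PBS -l nodes=2:ppn=2
#PBS -l walltime=00:10:00
#PBS -j oe

# Torque starts the job in $HOME; return to the directory qsub was run from.
cd "$PBS_O_WORKDIR"

# Assumption: Open MPI has Torque (tm) support, so mpiexec takes the node
# list from Torque and neither a hostfile nor -n is given here.
/usr/lib/openmpi/1.2.5-gcc/bin/mpiexec ./a.out
EOF

# Submit only if Torque's qsub is actually installed on this machine.
if command -v qsub >/dev/null 2>&1; then
    qsub mpi_job.pbs || echo "qsub failed; check that pbs_server is running" >&2
else
    echo "qsub not found; wrote mpi_job.pbs but did not submit it" >&2
fi
```

Once a job is submitted, "qstat" shows its state in the queue.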

I hope this helps,
Gus Correa

-- 
---------------------------------------------------------------------
Gustavo J. Ponce Correa, PhD - Email: gus_at_[hidden]
Lamont-Doherty Earth Observatory - Columbia University
P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------
Lenny Verkhovsky wrote:
> you can also provide a full path to your mpi
>
> #/usr/lib/openmpi/1.2.5-gcc/bin/mpiexec -n 2 ./a.out
>
> On 8/12/08, jody <jody.xha_at_[hidden]> wrote:
>
>     No.
>     The PATH variable simply tells the system in which order the
>     directories should be searched for executables.
>
>     so in .bash_profile just add the line
>       PATH=/usr/lib/openmpi/1.2.5-gcc/bin:$PATH
>     after the line
>       PATH=$PATH:$HOME/bin
>
>     Then the system will search in /usr/lib/openmpi/1.2.5-gcc/bin before
>     it will look in the directories it would have looked in anyway.
>
>
>
>     Jody
>
>
>     On Tue, Aug 12, 2008 at 11:59 AM, Rayne <lancer6238_at_[hidden]> wrote:
>     > My .bash_profile and .bashrc on the server are exactly the same
>     > as that on my PC. However, I can run mpiexec without any problems
>     > just using my PC as a single node, i.e. without trying to login to
>     > other servers and using multiple nodes. I only get the errors on
>     > the server.
>     >
>     > In .bash_profile, I see
>     >
>     > PATH=$PATH:$HOME/bin
>     >
>     > If I change this, won't it affect other programs as well?
>     >
>     > Thank you.
>     >
>     > Regards,
>     > Rayne
>     >
>     > --- On Tue, 12/8/08, jody <jody.xha_at_[hidden]> wrote:
>     >
>     >> From: jody <jody.xha_at_[hidden]>
>     >> Subject: Re: [OMPI users] Setting up Open MPI to run on multiple servers
>     >> To: lancer6238_at_[hidden], "Open MPI Users" <users_at_[hidden]>
>     >> Date: Tuesday, 12 August, 2008, 5:23 PM
>     >> What are the contents of your $PATH environment variable?
>     >> Make sure that your Open-MPI folder
>     >> (/usr/lib/openmpi/1.2.5-gcc/bin)
>     >> precedes '/usr/bin' in $PATH,
>     >> i.e.
>     >> /usr/lib/openmpi/1.2.5-gcc/bin:/usr/bin
>     >>
>     >> then the Open-MPI version of mpirun or mpiexec will be used
>     >> instead of
>     >> the LAM-versions.
>     >>
>     >> This should also be the case on your other machines.
>     >>
>     >> BTW, since it seems you haven't correctly set your PATH
>     >> variable, i
>     >> suspect you have omitted
>     >> to set LD_LIBRARY_PATH as well...
>     >> see points 1,2 and 3 in
>     >> http://www.open-mpi.org/faq/?category=running
>     >>
>     >> Jody
>     >>
>     >> On Tue, Aug 12, 2008 at 11:10 AM, Rayne <lancer6238_at_[hidden]> wrote:
>     >> > Hi,
>     >> >
>     >> > I looked for any folders with 'lam', and found 2, under
>     >> > /usr/lib/lam and /etc/lam. I don't know if it means LAM was
>     >> > previously installed, because my PC also has /usr/lib/lam,
>     >> > although the contents are different. I renamed the 2 folders,
>     >> > and got the "*** Oops -- I cannot open the LAM help file."
>     >> > error below instead.
>     >> >
>     >> > I tried 'which mpiexec', and it gave me /usr/bin/mpiexec. I
>     >> > checked the mpiexec there and it's actually a Perl script, and
>     >> > I believe I installed OpenMPI in /usr/lib64/openmpi/1.2.5-gcc/
>     >> >
>     >> > So I tried mpirun instead and it gave me the following message:
>     >> >
>     >> > "*** Oops -- I cannot open the LAM help file.
>     >> > *** I tried looking for it in the following places:
>     >> > ***
>     >> > ***   $HOME/lam-helpfile
>     >> > ***   $HOME/lam-7.0.6-helpfile
>     >> > ***   $HOME/etc/lam-helpfile
>     >> > ***   $HOME/etc/lam-7.0.6-helpfile
>     >> > ***   $LAMHELPDIR/lam-helpfile
>     >> > ***   $LAMHELPDIR/lam-7.0.6-helpfile
>     >> > ***   $LAMHOME/etc/lam-helpfile
>     >> > ***   $LAMHOME/etc/lam-7.0.6-helpfile
>     >> > ***   $SYSCONFDIR/lam-helpfile
>     >> > ***   $SYSCONFDIR/lam-7.0.6-helpfile
>     >> > ***
>     >> > *** You were supposed to get help on the program "MPI"
>     >> > *** about the topic "no-lamd"
>     >> > ***
>     >> > *** Sorry!"
>     >> >
>     >> > Firstly, how do I change the settings such that mpiexec points
>     >> > to the mpiexec in my installation folder, which I believe
>     >> > should be /usr/lib/openmpi/1.2.5-gcc/bin/mpiexec, and the
>     >> > mpiexec there seems to be a shortcut that points to
>     >> > /usr/lib/openmpi/1.2.5-gcc/bin/orterun. Would this help?
>     >> > While I'm at it, it seems that mpirun, which is
>     >> > /usr/bin/mpirun currently, should also point to
>     >> > /usr/lib/openmpi/1.2.5-gcc/bin/mpirun, which also is a
>     >> > shortcut to /usr/lib/openmpi/1.2.5-gcc/bin/orterun.
>     >> >
>     >> > Thank you.
>     >> >
>     >> > Regards,
>     >> > Rayne
>     >> >
>     >> > --- On Tue, 12/8/08, jody <jody.xha_at_[hidden]> wrote:
>     >> >
>     >> >> From: jody <jody.xha_at_[hidden]>
>     >> >> Subject: Re: [OMPI users] Setting up Open MPI to run on multiple servers
>     >> >> To: lancer6238_at_[hidden], "Open MPI Users" <users_at_[hidden]>
>     >> >> Date: Tuesday, 12 August, 2008, 3:38 PM
>     >> >> Hi Ryan
>     >> >> Another thing:
>     >> >> Have you checked if the mpiexec you call is really the one
>     >> >> from your Open-MPI installation?
>     >> >>
>     >> >> Try 'which mpiexec' to find out.
>     >> >>
>     >> >> Jody
>     >> >>
>     >> >> On Tue, Aug 12, 2008 at 9:36 AM, jody <jody.xha_at_[hidden]> wrote:
>     >> >> > Hi Ryan
>     >> >> >
>     >> >> > The message "Lamnodes Failed!" seems to indicate that you
>     >> >> > still have a LAM/MPI installation somewhere.
>     >> >> > You should get rid of that first.
>     >> >> >
>     >> >> > Jody
>     >> >> >
>     >> >> > On Tue, Aug 12, 2008 at 9:00 AM, Rayne <lancer6238_at_[hidden]> wrote:
>     >> >> >> Hi, thanks for your reply.
>     >> >> >>
>     >> >> >> I did what you said, set up the password-less ssh, nfs etc,
>     >> >> >> and put the IP address of the server in the default hostfile
>     >> >> >> (in my PC only, the default hostfile in the server does not
>     >> >> >> contain any IP addresses). Then I installed Open MPI in the
>     >> >> >> server under the same directory as my PC, e.g.
>     >> >> >> /usr/lib/openmpi/1.2.5-gcc/
>     >> >> >> All my MPI programs and executables, e.g. a.out are in the
>     >> >> >> shared folder. However, I have trouble running the MPI
>     >> >> >> programs.
>     >> >> >>
>     >> >> >> After compiling my MPI program on my PC, I tried to run it
>     >> >> >> via "mpiexec -n 2 ./a.out". However, I get the error message
>     >> >> >>
>     >> >> >> "Failed to find or execute the following executable:
>     >> >> >> Host: (the name of the server)
>     >> >> >> Executable: ./a.out
>     >> >> >>
>     >> >> >> Cannot continue"
>     >> >> >>
>     >> >> >> Then when I tried to run the MPI program on my server after
>     >> >> >> compiling, I get the error:
>     >> >> >>
>     >> >> >> "Lamnodes Failed!
>     >> >> >> Check if you had booted lam before calling mpiexec
>     >> >> >> else use -machinefile to pass host file to mpiexec"
>     >> >> >>
>     >> >> >> I'm guessing that because the server cannot run the MPI
>     >> >> >> program, I can't run the program on my PC as well. Is there
>     >> >> >> some other configurations I missed when using Open MPI on my
>     >> >> >> server?
>     >> >> >>
>     >> >> >> Thank you.
>     >> >> >>
>     >> >> >> Regards,
>     >> >> >> Rayne
>     >> >
>     >> >
>     >> >
>     >> > _______________________________________________
>     >> > users mailing list
>     >> > users_at_[hidden] <mailto:users_at_[hidden]>
>     >> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>     >> >