Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Elementary question on openMPI application location when using PBS submission
From: Joshua Hursey (jjhursey_at_[hidden])
Date: 2009-12-02 08:49:29


The --preload-* options to 'mpirun' currently use the ssh/scp commands (or rsh/rcp via an MCA parameter) to copy files from the machine where 'mpirun' runs to the compute nodes during launch. This assumes that Open MPI is already installed on all of the machines. The options are aimed at users who do not wish to have an NFS or similar mount on all machines.
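
For illustration only, here is a rough sketch of how such a command line could be put together. The option names are the ones from the 1.3.x man page excerpt quoted further down; the helper function build_preload_cmd and the file name input.dat are hypothetical, and the script only prints the command so it can be tried without an MPI installation.

```shell
#!/bin/sh
# Sketch only: assemble an mpirun command line that uses the preload options.
# build_preload_cmd and input.dat are made-up names for this example.

# Usage: build_preload_cmd <nprocs> <executable> <comma-separated files or "">
build_preload_cmd() {
    bp_cmd="mpirun -np $1 --preload-binary"
    # --preload-files takes a comma separated list; skip it when empty.
    if [ -n "$3" ]; then
        bp_cmd="$bp_cmd --preload-files $3"
    fi
    echo "$bp_cmd $2"
}

# Print the command instead of running it.
build_preload_cmd 2 ./hello input.dat
```

To actually launch, you would run the printed command from the node that holds the executable, assuming your Open MPI version still provides these options.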

Torque/PBS may be faster at this depending on how they organize the staging, but I assume that we are essentially doing the same thing. There was a post on the users list a little while back discussing these options a bit more fully.
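
For comparison, a minimal sketch of the Torque/PBS side, assuming the "local_file@hostname:remote_file" form described in "man qsub" for the -W stagein option. The helper stagein_spec and the paths /tmp/hello and /home/user/hello are hypothetical; HN, my_script.sh and "-l nodes=2" are taken from the original question. The qsub command is only printed, so nothing is submitted.

```shell
#!/bin/sh
# Sketch only: format a Torque/PBS stage-in request.
# stagein_spec and both paths are made-up names for this example.

# Usage: stagein_spec <path on compute node> <source host> <path on source host>
stagein_spec() {
    echo "stagein=$1@$2:$3"
}

spec=$(stagein_spec /tmp/hello HN /home/user/hello)
# Print the qsub command instead of running it, so this works without Torque.
echo "qsub -l nodes=2 -W $spec my_script.sh"
```

With something like this, my_script.sh would have to launch the staged copy (/tmp/hello in this sketch) rather than the original path on the head node.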

-- Josh

On Dec 1, 2009, at 3:21 PM, Belaid MOA wrote:

> I saw those options before but somehow I did not pay attention to them :(.
> I was thinking that the copying is done automatically, so I felt the options were useless but I was wrong.
> Thanks a lot Gus; that's exactly what I was looking for. I will try them then.
>
> Best Regards.
> ~Belaid.
>
> > Date: Tue, 1 Dec 2009 15:14:01 -0500
> > From: gus_at_[hidden]
> > To: users_at_[hidden]
> > Subject: Re: [OMPI users] Elementary question on openMPI application location when using PBS submission
> >
> > Hi Belaid Moa
> >
> > I spoke too fast, and burnt my tongue.
> > I should have double checked before speaking out.
> > I just looked up "man mpiexec" and found the options below.
> > I never used or knew about them, but you may want to try.
> > They seem to be similar to the Torque/PBS stage_in feature.
> > I would guess they use scp to copy the executable and other
> > files to the nodes, but I don't really know which copying
> > mechanism is used.
> >
> > Gus Correa
> > ---------------------------------------------------------------------
> > Gustavo Correa
> > Lamont-Doherty Earth Observatory - Columbia University
> > Palisades, NY, 10964-8000 - USA
> > ---------------------------------------------------------------------
> >
> > #############################################
> > Excerpt from (OpenMPI 1.3.2) "man mpiexec":
> > #############################################
> >
> > --preload-binary
> >     Copy the specified executable(s) to remote machines prior to
> >     starting remote processes. The executables will be copied to
> >     the Open MPI session directory and will be deleted upon
> >     completion of the job.
> >
> > --preload-files <files>
> >     Preload the comma separated list of files to the current
> >     working directory of the remote machines where processes will
> >     be launched prior to starting those processes.
> >
> > --preload-files-dest-dir <path>
> >     The destination directory to be used for preload-files, if
> >     other than the current working directory. By default, the
> >     absolute and relative paths provided by --preload-files are
> >     used.
> >
> >
> > ################################
> >
> > Gus Correa wrote:
> > > Hi Belaid Moa
> > >
> > > Belaid MOA wrote:
> > >> Thank you very very much Gus. Does this mean that OpenMPI does not
> > >> copy the executable from the master node to the worker nodes?
> > >
> > > Not that I know of.
> > > Making the executable, and any input files the program may need,
> > > available on the nodes is the user's responsibility,
> > > not mpiexec's.
> > >
> > > On the other hand,
> > > Torque/PBS has a "stage_in/stage_out" feature that is supposed to
> > > copy files over to the nodes, if you want to give it a shot.
> > > See "man qsub" and look under the "-W" option (which has numerous
> > > sub-options) for "stagein=file_list" and "stageout=file_list".
> > > This is a relic from the old days where everything had to be on
> > > local disks on the nodes, and NFS ran over Ethernet 10/100,
> > > but it is still used by people that
> > > run MPI programs with heavy I/O, to avoid pounding on NFS or
> > > even on parallel file systems.
> > > I tried the stage_in/out feature a loooong time ago,
> > > (old PBS before Torque), but it had issues.
> > > It probably works now with the newer/better
> > > versions of Torque.
> > >
> > > However, the easy way to get this right is just to use an NFS mounted
> > > directory.
> > >
> > >> If that's the case, I will go ahead and NFS mount my working directory.
> > >>
> > >
> > > This would make your life much easier.
> > >
> > > My $0.02.
> > > Gus Correa
> > > ---------------------------------------------------------------------
> > > Gustavo Correa
> > > Lamont-Doherty Earth Observatory - Columbia University
> > > Palisades, NY, 10964-8000 - USA
> > > ---------------------------------------------------------------------
> > >
> > >
> > >
> > >
> > >> ~Belaid.
> > >>
> > >>
> > >> > Date: Tue, 1 Dec 2009 13:50:57 -0500
> > >> > From: gus_at_[hidden]
> > >> > To: users_at_[hidden]
> > >> > Subject: Re: [OMPI users] Elementary question on openMPI application location when using PBS submission
> > >> >
> > >> > Hi Belaid MOA
> > >> >
> > >> > See this FAQ:
> > >> >
> > >> > http://www.open-mpi.org/faq/?category=running#do-i-need-a-common-filesystem
> > >> > http://www.open-mpi.org/faq/?category=building#where-to-install
> > >> > http://www.open-mpi.org/faq/?category=tm#tm-obtain-host
> > >> >
> > >> > Your executable needs to be on a directory that is accessible
> > >> > by all nodes in your node pool.
> > >> > An easy way to achieve this is to put it in a directory that
> > >> > is NFS mounted on all nodes, and launch your pbs script from there.
> > >> >
> > >> > A less convenient alternative, if no NFS directory is available,
> > >> > is to copy the executable over to the nodes.
> > >> >
> > >> > I also find it easier to write a PBS script instead of putting
> > >> > all the PBS directives on the command line.
> > >> > In this case you can put the lines below in your PBS script,
> > >> > to ensure the job starts in your work directory (cd $PBS_O_WORKDIR):
> > >> >
> > >> > ########
> > >> >
> > >> > #PBS ... (PBS directives)
> > >> > ...
> > >> > cd $PBS_O_WORKDIR
> > >> > mpiexec -n ....
> > >> >
> > >> > ########
> > >> >
> > >> > IIRR, by default Torque/PBS puts you in your home directory on
> > >> > the nodes, which may or may not be the location of your executable.
> > >> >
> > >> > I hope this helps,
> > >> > Gus Correa
> > >> > ---------------------------------------------------------------------
> > >> > Gustavo Correa
> > >> > Lamont-Doherty Earth Observatory - Columbia University
> > >> > Palisades, NY, 10964-8000 - USA
> > >> > ---------------------------------------------------------------------
> > >> >
> > >> > Belaid MOA wrote:
> > >> > > Hello everyone,
> > >> > > I am new to this list and I have a very elementary question:
> > >> > > suppose we have three machines: HN (the head node hosting the PBS
> > >> > > server), WN1 (a worker node) and WN2 (another worker node). The PBS
> > >> > > nodefile has WN1 and WN2 in it (it DOES NOT HAVE HN).
> > >> > > My Open MPI program (hello) and PBS script (my_script.sh) reside on
> > >> > > the HN. When I submit my PBS script using qsub -l nodes=2 my_script.sh,
> > >> > > I get the following error:
> > >> > >
> > >> > >
> > >> > > --------------------------------------------------------------------------
> > >> > > mpirun was unable to launch the specified application as it could not
> > >> > > find an executable:
> > >> > >
> > >> > > Executable: hello
> > >> > > Node: WN2
> > >> > >
> > >> > > while attempting to start process rank 0.
> > >> > > --------------------------------------------------------------------------
> > >> > >
> > >> > > How come my hello program is not copied automatically to the worker
> > >> > > nodes? This leads to my elementary question:
> > >> > > where should the application be when using a PBS submission?
> > >> > >
> > >> > > Note that when I run mpirun from HN with a machinefile containing
> > >> > > WN1 and WN2, I get the right output.
> > >> > >
> > >> > > Any help on this is much appreciated.
> > >> > >
> > >> > > ~Belaid.
> > >> > >
> > >> > > _______________________________________________
> > >> > > users mailing list
> > >> > > users_at_[hidden]
> > >> > > http://www.open-mpi.org/mailman/listinfo.cgi/users