
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Elementary question on openMPI application location when using PBS submission
From: Gus Correa (gus_at_[hidden])
Date: 2009-12-01 15:14:01


Hi Belaid Moa

I spoke too fast, and burnt my tongue.
I should have double-checked before speaking out.
I just looked up "man mpiexec" and found the options below.
I have never used them, but you may want to give them a try.
They seem to be similar to the Torque/PBS stage_in feature.
I would guess they use scp to copy the executable and other
files to the nodes, but I don't really know which copying
mechanism is used.

Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

#############################################
Excerpt from (OpenMPI 1.3.2) "man mpiexec":
#############################################

        --preload-binary
                  Copy the specified executable(s) to remote machines prior
                  to starting remote processes. The executables will be
                  copied to the Open MPI session directory and will be
                  deleted upon completion of the job.

        --preload-files <files>
                  Preload the comma separated list of files to the current
                  working directory of the remote machines where processes
                  will be launched prior to starting those processes.

        --preload-files-dest-dir <path>
                  The destination directory to be used for preload-files,
                  if other than the current working directory. By default,
                  the absolute and relative paths provided by
                  --preload-files are used.

################################
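For illustration, the first two options could be used along these lines (a sketch only, untested on my side; the executable name, file names, and process count are made up):

```shell
# Sketch of the preload options from the man page excerpt above.
# "hello", the input files, and the process count are hypothetical.

# Copy the executable to the remote nodes before launching:
mpiexec -n 4 --preload-binary ./hello

# Also stage input files into the remote working directories:
mpiexec -n 4 --preload-binary --preload-files input.dat,params.txt ./hello
```

Whether this removes the need for a shared filesystem entirely depends on whether your program opens other files at run time.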

Gus Correa wrote:
> Hi Belaid Moa
>
> Belaid MOA wrote:
>> Thank you very very much Gus. Does this mean that OpenMPI does not
>> copy the executable from the master node to the worker nodes?
>
> Not that I know of.
> Making the executable, and any input files the program may need,
> available on the nodes is the user's responsibility,
> not mpiexec's.
>
> On the other hand,
> Torque/PBS has a "stage_in/stage_out" feature that is supposed to
> copy files over to the nodes, if you want to give it a shot.
> See "man qsub" and look under the (numerous) "-W" options for
> the "stagein=file_list" and "stageout=file_list" sub-options.
> This is a relic from the old days when everything had to be on
> local disks on the nodes, and NFS ran over Ethernet 10/100,
> but it is still used by people who
> run MPI programs with heavy I/O, to avoid pounding on NFS or
> even on parallel file systems.
> I tried the stage_in/out feature a loooong time ago,
> (old PBS before Torque), but it had issues.
> It probably works now with the newer/better
> versions of Torque.
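As a rough sketch, a stage-in request on the qsub command line might look like the following (the host name and paths are hypothetical, and the exact stagein syntax differs between PBS/Torque versions, so check "man qsub" on your system):

```shell
# Hypothetical sketch: ask Torque/PBS to copy the executable from the
# head node (HN) to the execution hosts before the job starts.
# Torque's format is: stagein=local_path@storage_host:storage_path
qsub -l nodes=2 \
     -W stagein=hello@HN:/home/user/hello \
     my_script.sh
```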
>
> However, the easy way to get this right is just to use an NFS mounted
> directory.
>
>> If that's case, I will go ahead and NFS mount my working directory.
>>
>
> This would make your life much easier.
>
> My $0.02.
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
>
>
>
>> ~Belaid.
>>
>>
>> > Date: Tue, 1 Dec 2009 13:50:57 -0500
>> > From: gus_at_[hidden]
>> > To: users_at_[hidden]
>> > Subject: Re: [OMPI users] Elementary question on openMPI
>> application location when using PBS submission
>> >
>> > Hi Belaid MOA
>> >
>> > See this FAQ:
>> >
>> http://www.open-mpi.org/faq/?category=running#do-i-need-a-common-filesystem
>>
>> > http://www.open-mpi.org/faq/?category=building#where-to-install
>> > http://www.open-mpi.org/faq/?category=tm#tm-obtain-host
>> >
>> > Your executable needs to be on a directory that is accessible
>> > by all nodes in your node pool.
>> > An easy way to achieve this is to put it in a directory that
>> > is NFS mounted on all nodes, and launch your pbs script from there.
>> >
>> > A less convenient alternative, if no NFS directory is available,
>> > is to copy the executable over to the nodes.
>> >
>> > I also find it easier to write a PBS script instead of putting
>> > all the PBS directives in the command line.
>> > In this case you can put the lines below in your PBS script,
>> > to ensure all processes start in your work directory
>> > (cd $PBS_O_WORKDIR):
>> >
>> > ########
>> >
>> > #PBS ... (PBS directives)
>> > ...
>> > cd $PBS_O_WORKDIR
>> > mpiexec -n ....
>> >
>> > ########
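Filled out, a minimal script along those lines might look like this (the directive values and the executable name are placeholders):

```shell
#!/bin/bash
#PBS -N hello                 # job name (placeholder)
#PBS -l nodes=2               # request two nodes
#PBS -j oe                    # merge stdout and stderr

# Torque/PBS starts the job in your home directory; move to the
# submission directory, which should be on a filesystem visible
# to all nodes (e.g. NFS-mounted).
cd $PBS_O_WORKDIR

mpiexec -n 2 ./hello
```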
>> >
>> > IIRC, by default Torque/PBS puts you in your home directory on
>> > the nodes, which may or may not be the location of your executable.
>> >
>> > I hope this helps,
>> > Gus Correa
>> > ---------------------------------------------------------------------
>> > Gustavo Correa
>> > Lamont-Doherty Earth Observatory - Columbia University
>> > Palisades, NY, 10964-8000 - USA
>> > ---------------------------------------------------------------------
>> >
>> > Belaid MOA wrote:
>> > > Hello everyone,
>> > > I am new to this list and I have a very elementary question:
>> suppose we
>> > > have three machines: HN (the head node hosting the PBS server), WN1 (a
>> > > worker node), and WN2 (another worker node). The PBS nodefile has
>> WN1 and
>> > > WN2 in it (DOES NOT HAVE HN).
>> > > My openMPI program (hello) and PBS script(my_script.sh) reside on
>> the
>> > > HN. When I submit my PBS script using qsub -l nodes=2
>> my_script.sh, I
>> > > get the following error:
>> > >
>> > >
>> --------------------------------------------------------------------------
>>
>> > > mpirun was unable to launch the specified application as it could
>> not
>> > > find an executable:
>> > >
>> > > Executable: hello
>> > > Node: WN2
>> > >
>> > > while attempting to start process rank 0.
>> > >
>> --------------------------------------------------------------------------
>>
>> > >
>> > > How come my hello program is not copied automatically to the worker
>> > > nodes? This leads to my elementary question:
>> > > where should the application be when using a PBS submission?
>> > >
>> > > Note that when I run mpirun from HN with machinefile containing
>> WN1 and
>> > > WN2, I get the right output.
>> > >
>> > > Any help on this is much appreciated.
>> > >
>> > > ~Belaid.
>> > >
>> > >
>> > >
>> > >
>> > > _______________________________________________
>> > > users mailing list
>> > > users_at_[hidden]
>> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >
>>
>