
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] restricting a job to a set of hosts
From: Erik Nelson (nelsonerikd_at_[hidden])
Date: 2012-07-26 21:54:15


I see. Thank you both for the prompt replies.

On Thu, Jul 26, 2012 at 8:21 PM, Ralph Castain <rhc_at_[hidden]> wrote:

> Application processes will *only* be placed on nodes included in the
> allocation. The -nolocal flag is intended to ensure that no application
> processes are started on the same node as mpirun in the case where that
> node is included in the allocation. This happens, for example, with Torque,
> where mpirun is executed on one of the allocated nodes.
>
> I believe SGE doesn't do that - and so the allocation won't include the
> submit host, in which case you don't need -nolocal.
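Whether the node running mpirun is part of the allocation can be checked from inside the job script. The following is a minimal sketch, not something from this thread: it assumes SGE exports $PE_HOSTFILE for parallel jobs with the host name in the first column, and host_in_allocation is a hypothetical helper name. Host names are compared without their domain part, since short and fully qualified names may be mixed.

```shell
#!/bin/sh
# Hypothetical helper: succeed if host $2 appears in the first column of
# the host file $1. Names are compared without their domain part.
host_in_allocation() {
    short=${2%%.*}
    awk -v h="$short" '
        { split($1, parts, "."); if (parts[1] == h) found = 1 }
        END { exit !found }
    ' "$1"
}

# Assumption: SGE sets PE_HOSTFILE inside a parallel job. Outside of a
# job the variable is unset and nothing is reported.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    if host_in_allocation "$PE_HOSTFILE" "$(hostname)"; then
        echo "mpirun host is part of the allocation"
    else
        echo "mpirun host is not part of the allocation"
    fi
fi
```

If the first message is printed, -nolocal would keep application processes off that node; if the second is printed, the flag changes nothing.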
>
>
> On Jul 26, 2012, at 5:58 PM, Erik Nelson wrote:
>
> I was under the impression that the -nolocal option keeps processes off
> the submit host (since there may be hundreds or thousands of jobs submitted
> at any time, and we don't want this host to be overloaded).
>
> My understanding of what you said in your last email is that, by listing
> the hosts, I automatically send all processes (parent and child, or master
> and slave if you prefer) to the specified list of hosts.
>
> Reading your email below, it looks like this was the correct understanding.
>
>
> On Thu, Jul 26, 2012 at 5:20 PM, Reuti <reuti_at_[hidden]> wrote:
>
>> Am 26.07.2012 um 23:58 schrieb Erik Nelson:
>>
>> > Reuti,
>> >
>> > Thank you. Our queue is backed up, so it will take a little while before I can try this.
>> >
>> > I assume that by specifying the nodes this way, I don't need (and it would confuse
>> > the system) to add -nolocal. In other words, qsub will try to put the parent node
>> > somewhere in this set.
>> >
>> > Is this the idea?
>>
>> Depends what you refer to by "parent node". I assume you mean the submit
>> host. This is never included in any created selection of SGE unless it's an
>> execution host too.
>>
>> The master host of the parallel job (i.e. the one where the jobscript
>> with the `mpiexec` is running) will be used as a normal machine from MPI's
>> point of view.
>>
>> -- Reuti
>>
>>
>> > Erik
>> >
>> >
>> > On Thu, Jul 26, 2012 at 4:48 PM, Reuti <reuti_at_[hidden]> wrote:
>> > Am 26.07.2012 um 23:33 schrieb Erik Nelson:
>> >
>> > > I have a purely parallel job that runs ~100 processes. Each process has ~identical
>> > > overhead so the speed of the program is dominated by the slowest processor.
>> > >
>> > > For this reason, I would like to restrict the job to a specific set of identical (fast)
>> > > processors on our cluster.
>> > >
>> > > I read the FAQ on -hosts and -hostfile, but it is still unclear to me what effect these
>> > > directives will have in a queuing environment.
>> > >
>> > > Currently, I submit the job using the "qsub" command in the "sge" environment as :
>> > >
>> > > qsub -pe mpich 101 jobfile.job
>> > >
>> > > where jobfile contains the command
>> > >
>> > > mpirun -np 101 -nolocal ./executable
>> >
>> > I would leave -nolocal out here.
>> >
>> > $ qsub -l "h=compute-5-[1-9]|compute-5-1[0-9]|compute-5-2[0-9]|compute-5-3[0-2]" -pe mpich 101 jobfile.job
>> >
>> > -- Reuti
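The host request in Reuti's command can also be generated rather than typed by hand. The sketch below is only an illustration under stated assumptions: instead of the wildcard patterns used above, it builds an explicit list of host names joined with "|", and the helper names build_host_request and total_slots are hypothetical. It prints the qsub command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: build an SGE hostname request of the form
# "h=PREFIX-FIRST|...|PREFIX-LAST". Arguments: prefix, first index, last index.
build_host_request() {
    prefix=$1
    first=$2
    last=$3
    request=""
    i=$first
    while [ "$i" -le "$last" ]; do
        if [ -z "$request" ]; then
            request="h=${prefix}-${i}"
        else
            request="${request}|${prefix}-${i}"
        fi
        i=$((i + 1))
    done
    printf '%s\n' "$request"
}

# Hypothetical helper: slots offered by the restricted node set.
# Arguments: first index, last index, slots per node.
total_slots() {
    echo $(( ($2 - $1 + 1) * $3 ))
}

# Print (do not run) the submission command for compute-5-1 .. compute-5-32.
echo "qsub -l \"$(build_host_request compute-5 1 32)\" -pe mpich 101 jobfile.job"
```

With 32 nodes of 8 slots each, total_slots 1 32 8 gives 256, so the 101 requested slots fit inside the restricted set.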
>> >
>> >
>> > > I would like to restrict the job to nodes compute-5-1 to compute-5-32 on our machine,
>> > > each containing 8 CPUs (slots). How do I go about this?
>> > >
>> > > Thanks, Erik
>> > >
>> > > --
>> > > Erik Nelson
>> > >
>> > > Howard Hughes Medical Institute
>> > > 6001 Forest Park Blvd., Room ND10.124
>> > > Dallas, Texas 75235-9050
>> > >
>> > > p : 214 645 5981
>> > > f : 214 645 5948
>> > > _______________________________________________
>> > > users mailing list
>> > > users_at_[hidden]
>> > > http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Erik Nelson
Howard Hughes Medical Institute
6001 Forest Park Blvd., Room ND10.124
Dallas, Texas 75235-9050
p : 214 645 5981
f : 214 645 5948