
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Oversubscription of nodes with Torque and OpenMPI
From: Reuti (reuti_at_[hidden])
Date: 2013-11-22 13:46:07


On 22.11.2013, at 19:34, Jason Gans wrote:

> On 11/22/13 11:18 AM, Lloyd Brown wrote:
>> As far as I understand, the mpirun will assign processes to hosts in the
>> hostlist ($PBS_NODEFILE) sequentially, and if it runs out of hosts in
>> the list, it starts over at the top of the file.
>>
>> Theoretically, you should be able to request specific hostnames, and the
>> processor counts per hostname, in your torque submit request. I'm not
>> sure if this is correct (we don't use Torque here anymore, and I'm going
>> off memory), but it should be approximately correct:
>>
>>> qsub -l nodes=n0000:2+n0001:2+n0002:8+n0003:8+n0004:2+n0005:2+n0006:2+n0007:4 ...
> Thanks! This is awkward, but it did the trick. To get the desired behavior I first
> had to provide a "fake" nodes file to Torque (where all of the nodes were listed
> as having a large number of processors, i.e. np=8). Now I can submit jobs using:
>
> qsub -I -l nodes=n0000:ppn=2+n0001:ppn=2+n0002:ppn=8+...

This shouldn't be necessary once Torque knows the actual number of cores in each machine and you request the suggested 24 cores.
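To illustrate the difference being discussed, here is a minimal sketch (plain Python, not Open MPI's actual mapping code) contrasting the round-robin placement Jason observed with slot-aware placement; the host names and slot counts are taken from the Torque nodes file quoted below:

```python
# Slot counts from the Torque nodes file (np=xx per host).
slots = {"n0000": 2, "n0001": 2, "n0002": 8, "n0003": 8,
         "n0004": 2, "n0005": 2, "n0006": 2, "n0007": 4}

def round_robin(hosts, nranks):
    # Observed behavior: hosts are cycled in order, wrapping around
    # at the end of the list and ignoring per-host slot counts.
    placement = {h: 0 for h in hosts}
    for rank in range(nranks):
        placement[hosts[rank % len(hosts)]] += 1
    return placement

def by_slot(slots, nranks):
    # Desired behavior: fill each host up to its slot count before
    # moving on to the next host.
    placement = {h: 0 for h in slots}
    remaining = nranks
    for host, n in slots.items():
        take = min(n, remaining)
        placement[host] = take
        remaining -= take
    return placement

print(round_robin(list(slots), 24))  # three ranks on every node
print(by_slot(slots, 24))            # respects np=2 on the small nodes
```

With 24 ranks over 8 hosts, the round-robin scheme puts exactly three ranks on every node (including the two-core ones), while the slot-aware scheme fills n0000 through n0005 to capacity and leaves the rest idle.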

-- Reuti

>
> and get the expected behavior (including the expected $PBS_NODEFILE, where the
> name of each node appears "ppn" number of times).
>
> Thanks to everyone who responded!
>
> Regards,
>
> Jason
>> Granted, that's awkward, but I'm not sure if there's another way in
>> Torque to request different numbers of processors per node. You might
>> ask on the Torque Users list. They might tell you to change the nodes
>> file to reflect the number of actual processes you want on each node,
>> rather than the number of physical processors on the hosts. Whether
>> this works for you depends on whether you want this type of
>> oversubscription to happen all the time or on a per-job basis, etc.
>>
>>
>> Lloyd Brown
>> Systems Administrator
>> Fulton Supercomputing Lab
>> Brigham Young University
>> http://marylou.byu.edu
>>
>> On 11/22/2013 11:11 AM, Gans, Jason D wrote:
>>> I have tried the 1.7 series (specifically 1.7.3) and I get the same
>>> behavior.
>>>
>>> When I run "mpirun -oversubscribe -np 24 hostname", three instances of
>>> "hostname" are run on each node.
>>>
>>> The contents of the $PBS_NODEFILE are:
>>> n0007
>>> n0006
>>> n0005
>>> n0004
>>> n0003
>>> n0002
>>> n0001
>>> n0000
>>>
>>> but, since I compiled OpenMPI with "--with-tm", it appears
>>> that OpenMPI is not using the $PBS_NODEFILE (which I tested by modifying
>>> the Torque pbs_mom to write a $PBS_NODEFILE that contained "slots=xx"
>>> information for each node; mpirun complained when I did this).
>>>
>>> Regards,
>>>
>>> Jason
>>>
>>> ------------------------------------------------------------------------
>>> *From:* users [users-bounces_at_[hidden]] on behalf of Ralph Castain
>>> [rhc_at_[hidden]]
>>> *Sent:* Friday, November 22, 2013 11:04 AM
>>> *To:* Open MPI Users
>>> *Subject:* Re: [OMPI users] Oversubscription of nodes with Torque and
>>> OpenMPI
>>>
>>> Really shouldn't matter - this is clearly a bug in OMPI if it is doing
>>> mapping as you describe. Out of curiosity, have you tried the 1.7
>>> series? Does it behave the same?
>>>
>>> I can take a look at the code later today and try to figure out what
>>> happened.
>>>
>>> On Nov 22, 2013, at 9:56 AM, Jason Gans <jgans_at_[hidden]> wrote:
>>>
>>>> On 11/22/13 10:47 AM, Reuti wrote:
>>>>> Hi,
>>>>>
>>>>> On 22.11.2013, at 17:32, Gans, Jason D wrote:
>>>>>
>>>>>> I would like to run an instance of my application on every *core* of
>>>>>> a small cluster. I am using Torque 2.5.12 to run jobs on the
>>>>>> cluster. The cluster in question is a heterogeneous collection of
>>>>>> machines that are all past their prime. Specifically, the number of
>>>>>> cores ranges from 2-8. Here is the Torque "nodes" file:
>>>>>>
>>>>>> n0000 np=2
>>>>>> n0001 np=2
>>>>>> n0002 np=8
>>>>>> n0003 np=8
>>>>>> n0004 np=2
>>>>>> n0005 np=2
>>>>>> n0006 np=2
>>>>>> n0007 np=4
>>>>>>
>>>>>> When I use openmpi-1.6.3, I can oversubscribe nodes but the tasks
>>>>>> are allocated to nodes without regard to the number of cores on each
>>>>>> node (specified by the "np=xx" in the nodes file). For example, when
>>>>>> I run "mpirun -np 24 hostname", mpirun places three instances of
>>>>>> "hostname" on each node, despite the fact that some nodes only have
>>>>>> two processors and some have more.
>>>>> Did you also request 24 cores when you submitted the job itself?
>>>>>
>>>>> -- Reuti
>>>> Since there are only 8 Torque nodes in the cluster, I submitted the
>>>> job by requesting 8 nodes, i.e. "qsub -I -l nodes=8".
>>>>>
>>>>>> Is there a way to have OpenMPI "gracefully" oversubscribe nodes by
>>>>>> allocating instances based on the "np=xx" information in the Torque
>>>>>> nodes file? Is this a Torque problem?
>>>>>>
>>>>>> p.s. I do get the desired behavior when I run *without* Torque and
>>>>>> specify the following machine file to mpirun:
>>>>>>
>>>>>> n0000 slots=2
>>>>>> n0001 slots=2
>>>>>> n0002 slots=8
>>>>>> n0003 slots=8
>>>>>> n0004 slots=2
>>>>>> n0005 slots=2
>>>>>> n0006 slots=2
>>>>>> n0007 slots=4
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Jason
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> users_at_[hidden]
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>