Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi's mpi_comm_spawn integrated with sge?
From: Reuti (reuti_at_[hidden])
Date: 2011-01-25 12:27:06

On 25.01.2011 at 12:32, Terry Dontje wrote:

> On 01/25/2011 02:17 AM, Will Glover wrote:
>> Hi all,
>> I tried a Google/mailing list search for this but came up with nothing, so here goes:
>> Is there any level of automation between Open MPI's dynamic process management and the SGE queue manager?
>> In particular, can I make a call to mpi_comm_spawn and have SGE dynamically increase the number of slots?
>> This seems a little far-fetched, but it would be really useful if it were possible. My application is 'restricted' to coarse-grain task parallelism and involves a workload that varies significantly during runtime (between 1 and ~100 parallel tasks). Dynamic process management would maintain an optimal number of processors and reduce idling.
>> Many thanks,
> This is an interesting idea, but no integration has been done that would allow an MPI job to request more slots.

Similar ideas came up on the former SGE mailing list a couple of times: varying resource requests over the lifetime of a job (cores, memory, licenses, ...). In the end this would require some kind of real-time queuing system, since the necessary resources have to be guaranteed free at the time they are needed.

Besides this, some syntax would also be necessary: either for requesting a "resource profile over time" when such a job is submitted, or for allowing a running job to issue commands that request or release resources on demand.

If you had such a "resource profile over time" for a bunch of jobs, the scheduler could then be extended to solve a cutting-stock problem where the unit to be cut is time, e.g. arrange these 10 jobs so that they all finish in the least total time - and you could predict exactly when each job will end. This is getting really complex.
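
To make the "resource profile over time" idea a bit more concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions, not something SGE offers: each job is described as a list of core counts per time step, and a simple greedy first-fit rule (longest job first) places each job at the earliest start where it never exceeds the cluster capacity. The names `earliest_start` and `schedule` are made up for this example, and the greedy rule does not guarantee the shortest possible overall runtime.

```python
from typing import Dict, List, Tuple


def earliest_start(usage: List[int], profile: List[int], capacity: int) -> int:
    """Return the first time step at which `profile` fits under `capacity`.

    `usage[t]` is the number of cores already committed at time step t;
    steps beyond the end of `usage` are treated as completely free.
    """
    start = 0
    while True:
        fits = all(
            (usage[start + i] if start + i < len(usage) else 0) + need <= capacity
            for i, need in enumerate(profile)
        )
        if fits:
            return start
        start += 1


def schedule(
    profiles: Dict[str, List[int]], capacity: int
) -> Tuple[Dict[str, Tuple[int, int]], int]:
    """Greedy first-fit placement of jobs, longest profile first.

    Returns the (start, end) time step of every job plus the overall
    makespan. This is a heuristic, not an optimal cutting-stock solver.
    """
    usage: List[int] = []
    plan: Dict[str, Tuple[int, int]] = {}
    for name, profile in sorted(profiles.items(), key=lambda kv: -len(kv[1])):
        if not profile or max(profile) > capacity:
            raise ValueError(f"job {name!r} is empty or can never fit")
        start = earliest_start(usage, profile, capacity)
        end = start + len(profile)
        # Grow the timeline if this job runs past its current end.
        usage.extend([0] * (end - len(usage)))
        for i, need in enumerate(profile):
            usage[start + i] += need
        plan[name] = (start, end)
    return plan, len(usage)
```

With a hypothetical capacity of 100 cores, calling `schedule({"A": [1, 100, 1, 50], "B": [40, 40], "C": [10, 10, 10]}, 100)` places B and C into the steps where A needs few cores, and the returned plan gives the predicted end step of each job.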


What can be done in your situation: set up some kind of "background queue" with a nice value of 19, and submit your parallel job to a queue with the default nice value of 0. Although you request 100 cores and reserve them (the background queue shouldn't be suspended in that case, of course), the background queue will still run at full speed while nothing else is running on the nodes. When some of the parallel tasks are started on the nodes, they will get most of the computing time (that is, intentional oversubscription). The background queue can be used for less important jobs. Such a setup is useful when, as in your case, your parallel application isn't running in parallel all the time.
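
To put a rough number on "most of the computing time", here is a small Python sketch. It rests on assumptions that go beyond the setup described above: the nodes run Linux with the CFS scheduler, both processes are CPU-bound and compete for the same core, and CPU time is split in proportion to the kernel's per-nice-level load weights (1024 for nice 0, 15 for nice 19). The helper name `cpu_shares` is made up for this example, and real shares will vary with I/O, other processes and scheduler details.

```python
from typing import List

# Linux CFS load weights for the two nice levels used in this setup.
# Assumption: the nodes run Linux with CFS; other schedulers differ.
NICE_TO_WEIGHT = {0: 1024, 19: 15}


def cpu_shares(nice_values: List[int]) -> List[float]:
    """Approximate share of one core for CPU-bound processes.

    Each process gets its weight divided by the sum of the weights of
    all runnable processes competing for that core.
    """
    weights = [NICE_TO_WEIGHT[n] for n in nice_values]
    total = sum(weights)
    return [w / total for w in weights]
```

Under these assumptions, `cpu_shares([0, 19])` gives the parallel task 1024/1039 of the core (a bit over 98%), while `cpu_shares([19])` shows the background job getting the whole core when it runs alone.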

-- Reuti

> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.dontje_at_[hidden]