Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Open MPI & Grid Engine/Grid Scheduler thread binding (was: New loadcheck)
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-07-14 19:46:09

Looping in the users mailing list so that Ralph and Oracle can comment...

On Jul 14, 2011, at 2:34 PM, Rayson Ho wrote:

> (CC'ing Jeff from the Open-MPI project...)
> On Thu, Jul 14, 2011 at 1:35 PM, Tad Kollar <tad.kollar_at_[hidden]> wrote:
>> As I thought more about it, I was afraid that might be the case, but hoped
>> sge_shepherd would do some magic for tightly-integrated jobs.
> To SGE, if each of the tasks is not started by sge_shepherd, then the
> only option is to set the binding mask to the allocation, which in
> your original case, was the whole system (48 CPUs).
>> We're running OpenMPI 1.5.3 if that makes a difference. Do you know of
>> anyone using an MVAPICH2 1.6 pe that can handle binding?
> I just downloaded Open MPI 1.5.4a and grep'ed the source; it looks like
> it does not check the SGE_BINDING env variable that is set by SGE.
>> The serial case worked (its affinity list was '0' instead of '0-47'), so at
>> least we know that's in good shape :-)
> Please also submit a few more jobs and see if the new hwloc code is
> able to handle multiple jobs running on your AMD MC server.
>> My ultimate goal is for affinity support to be enabled and scheduled
>> automatically for all MPI users, i.e. without them having to do any more
>> than they would for a no-affinity job (otherwise I have a feeling most of
>> them would just ignore it). What do you think it will take to get to that
>> point?
> That's my goal since 2008...
> I started a mail thread, "processor affinity -- OpenMPI / batchsystem
> integration", on the Open MPI list in 2008. And in 2009, the conclusion
> was that Sun was saying that the binding info is set in the
> environment and Open MPI would perform the binding itself (so I
> assumed that was done):
> Revisiting the presentation (see: job2core.pdf link at the above URL),
> Sun's variable name is $SUNW_MP_BIND, so it is most likely Sun Cluster
> Toolkit implementation specific rather than a feature in Open MPI --
> and looking at the Open MPI code I don't see SUNW_MP_BIND referenced
> anywhere.
> I believe it is a matter of integrating the thread binding support
> between the two -- both SGE & Open MPI support thread binding. The
> harder part is handling cross-node binding, as SGE binds threads
> locally only (not directly controlled by qmaster) -- maybe a call to
> "qstat -cb -j <job id>" would do the trick, with the info parsed and
> passed to mpirun via the "--rankfile" option.
> Rayson
>> Thanks!
>> Tad
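
To make the rankfile idea above concrete, here is a minimal sketch of the last step only: turning per-host core bindings into an Open MPI rankfile. It assumes the binding info has already been collected as "socket,core" pairs joined by ":" (e.g. "0,0:0,1"); that input format, the host names, and the helper names parse_binding and build_rankfile are illustrative assumptions. Parsing the output of "qstat -cb -j <job id>" is not shown, since its format is not described in this thread.

```python
def parse_binding(spec):
    """Parse an assumed 'socket,core:socket,core' string into (socket, core) tuples."""
    pairs = []
    for item in spec.split(":"):
        socket, core = item.split(",")
        pairs.append((int(socket), int(core)))
    return pairs


def build_rankfile(host_bindings):
    """Build rankfile text from an ordered list of (hostname, [(socket, core), ...]).

    One MPI rank is assigned per bound core, numbered in the order given.
    """
    lines = []
    rank = 0
    for host, pairs in host_bindings:
        for socket, core in pairs:
            # Open MPI rankfile syntax: rank <N>=<host> slot=<socket>:<core>
            lines.append(f"rank {rank}={host} slot={socket}:{core}")
            rank += 1
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical two-node allocation with two bound cores on each node.
    allocation = [
        ("node01", parse_binding("0,0:0,1")),
        ("node02", parse_binding("1,0:1,1")),
    ]
    print(build_rankfile(allocation), end="")
```

The resulting text could be saved to a file and handed to mpirun with something like "mpirun -np 4 --rankfile ./rankfile ./a.out". Whether the socket:core numbering reported by SGE lines up with what Open MPI expects would need to be checked on the actual system.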

Jeff Squyres