You're probably aware: starting with 1.3.4, OMPI will detect and abide
by external bindings. So if grid engine sets a binding, we'll follow it.
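
For anyone wondering what "detect an external binding" can look like in practice, here is a minimal sketch in Python. It is not OMPI's implementation; it only assumes that an external binding shows up as an allowed-CPU set smaller than the number of CPUs the OS reports. The function names are made up for illustration, and `os.sched_getaffinity` is only available on Linux, so the sketch falls back to "all CPUs" elsewhere.

```python
import os


def externally_bound(allowed, online_count):
    """Return True if the allowed CPU set is smaller than the online CPU count."""
    return len(allowed) < online_count


def current_binding():
    """Return (allowed CPU set, CPU count) for the calling process."""
    online = os.cpu_count() or 1
    # sched_getaffinity is Linux-only; assume "no restriction" where it is missing.
    get_affinity = getattr(os, "sched_getaffinity", None)
    allowed = set(get_affinity(0)) if get_affinity else set(range(online))
    return allowed, online
```

A launcher following this idea would call `current_binding()` at startup and, if `externally_bound(...)` is true, keep its children inside the inherited set instead of applying its own policy.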
On Oct 22, 2009, at 9:03 AM, Rayson Ho wrote:
> The code for the Job to Core Binding (a.k.a. thread binding, or CPU
> binding) feature was checked into the Grid Engine project cvs. It uses
> OpenMPI's Portable Linux Processor Affinity (PLPA) library, and is
> topology and NUMA aware.
> The presentation from HPC Software Workshop '09:
> The design doc:
> Initial support is planned for 6.2 update 5 (current release is update
> 4, so update 5 is likely to be released in the next 2 or 3 months).
> On Tue, Sep 30, 2008 at 2:23 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>> Note that we would also have to modify OMPI to:
>> 1. recognize these environmental variables, and
>> 2. use them to actually set the binding, instead of using OMPI-
>> Not a big deal to do, but not something currently in the system. Since we
>> launch through our own daemons (something that isn't likely to change in
>> your time frame), these changes would be required.
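
The two steps listed above (recognize the variables, then use them to set the binding) could be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the variable name `SGE_EXAMPLE_CORES` and its "0,2,4-6" list format are hypothetical, since no interface was agreed on in this thread, and the real change would live in OMPI's C code. `os.sched_setaffinity` is Linux-only, so the sketch skips the actual bind where it is missing.

```python
import os

# Hypothetical variable name and format; nothing like this was defined in the thread.
ENV_NAME = "SGE_EXAMPLE_CORES"


def parse_core_list(text):
    """Parse a list such as "0,2,4-6" into a set of integer core IDs."""
    cores = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cores.update(range(int(low), int(high) + 1))
        else:
            cores.add(int(part))
    return cores


def bind_from_environment(environ=os.environ):
    """Bind the calling process to the requested cores, if any were requested."""
    text = environ.get(ENV_NAME)
    if not text:
        return None  # no external request: the caller keeps its own policy
    cores = parse_core_list(text)
    set_affinity = getattr(os, "sched_setaffinity", None)
    if set_affinity is not None:
        set_affinity(0, cores)  # 0 means "the calling process"
    return cores
```

Returning `None` when the variable is absent leaves room for the fallback discussed in this thread, where the MPI side does its own mapping.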
>> Otherwise, we could come up with some method by which you could
>> mapper information we use. While I agree with Jeff that having you tell us
>> which cores to use for each rank would generally be better, it does raise
>> issues when users want specific mapping algorithms that you might not
>> support. For example, we are working on mappers that will take input from
>> the user regarding comm topology plus system info on network wiring
>> and generate a near-optimal mapping of ranks. As part of that, users may
>> request some number of cores be reserved for that rank for threading or
>> other purposes.
>> So perhaps both options would be best - give us the list of cores
>> to us so we can map and do affinity, and pass in your own mapping.
>> with some logic so we can decide which to use based on whether OMPI or GE
>> did the mapping??
>> Not sure here - just thinking out loud.
>> On Sep 30, 2008, at 12:58 PM, Jeff Squyres wrote:
>>> On Sep 30, 2008, at 2:51 PM, Rayson Ho wrote:
>>>> Restarting this discussion. A new update version of Grid Engine 6.2
>>>> will come out early next year, and I really hope that we can
>>>> at least the interface defined.
>>>> At the minimum, is it enough for the batch system to tell OpenMPI
>>>> an env variable which core (or virtual core, in the SMT case) to
>>>> binding the first MPI task?? I guess an added bonus would be
>>>> information about the number of processors to skip (the stride)
>>>> between the sibling tasks?? Stride of one is usually the case, but
>>>> something larger than one would allow the batch system to control
>>>> level of cache and memory bandwidth sharing between the MPI
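
To make the "first core plus stride" proposal concrete, here is a small Python sketch of the arithmetic. The function and parameter names (`cores_for_tasks`, `first_core`, `stride`) are hypothetical; the thread does not define how the batch system would pass these values.

```python
def cores_for_tasks(first_core, stride, ntasks):
    """Return the core assigned to each local task: first_core + i * stride."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    if first_core < 0 or ntasks < 0:
        raise ValueError("first_core and ntasks must be non-negative")
    return [first_core + i * stride for i in range(ntasks)]


# Stride 1 packs sibling tasks onto adjacent cores.
print(cores_for_tasks(0, 1, 4))  # [0, 1, 2, 3]
# Stride 2 leaves every other core free between sibling tasks.
print(cores_for_tasks(0, 2, 4))  # [0, 2, 4, 6]
```

Whether a larger stride actually reduces cache or memory bandwidth sharing depends on how the machine numbers its cores, which this sketch does not model.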
>>> Wouldn't it be better to give us a specific list of cores to bind to? As
>>> core counts go up in servers, I think we may see a re-emergence of
>>> multiple MPI jobs on a single server. And as core counts go even higher,
>>> then fragmentation of available cores over time is possible/likely.
>>> Would you be giving us a list of *relative* cores to bind to (i.e., "bind
>>> to the Nth online core on the machine" -- which may be different than the
>>> OS's ID for that processor) or will you be giving us the actual OS
>>> processor ID(s) to bind to?
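
The difference between a relative core index and an OS processor ID can be shown with a short Python sketch. The helper name `relative_to_os_id` and the sample set of online IDs are made up for illustration; the only assumption is that "Nth online core" means the Nth entry, counting from zero, when the online OS IDs are sorted.

```python
def relative_to_os_id(online_os_ids, n):
    """Map "the Nth online core" (0-based) to the OS's ID for that processor."""
    ordered = sorted(online_os_ids)
    if not 0 <= n < len(ordered):
        raise IndexError("relative core index out of range")
    return ordered[n]


# Hypothetical machine where OS processors 2 and 3 are offline.
online = {0, 1, 4, 5}
print(relative_to_os_id(online, 2))  # 4: relative index 2 is OS processor 4
```

With all processors online and numbered contiguously from zero, the two numbering schemes coincide, which is why the distinction is easy to overlook.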
>>> Jeff Squyres
>>> Cisco Systems