
Open MPI Development Mailing List Archives


From: Rich L. Graham (rlgraham_at_[hidden])
Date: 2005-07-18 08:01:08

On Jul 18, 2005, at 6:28 AM, Jeff Squyres wrote:

> On Jul 18, 2005, at 2:50 AM, Matt Leininger wrote:
>>> Generally speaking, if you launch <=N processes in a job on a node
>>> (where N == number of CPUs on that node), then we set processor
>>> affinity. We set each process's affinity to the CPU number according
>>> to the VPID ordering of the procs in that job on that node. So if
>>> you
>>> launch VPIDs 5, 6, 7, 8 on a node, 5 would go to processor 0, 6 would
>>> go to processor 1, etc. (it's an easy, locally-determined ordering).
>> You'd need to be careful with dual-core cpus. Say you launch a 4
>> task MPI job on a 4-socket dual core Opteron. You'd want to schedule
>> the tasks on nodes 0, 2, 4, 6 - not 0, 1, 2, 3 to get maximum memory
>> bandwidth to each MPI task.
> With the potential for non-trivial logic like this, perhaps the extra
> work for a real framework would be justified, then.
>> Also, how would this work with hybrid MPI+threading (either
>> pthreads
>> or OpenMP) applications? Let's say you have an 8 or 16 cpu node and
>> you
>> start up 2 MPI tasks with 4 compute threads in each task. The optimum
>> layout may not be running the MPI tasks on cpu's 0 and 1. Several
>> hybrid applications that ran on ASC White and now Purple will have
>> these
>> requirements.
> Hum. Good question. The MPI API doesn't really address this -- the
> MPI API is not aware of additional threads that are created until you
> call an MPI function (and even then, we're not currently checking which
> thread is calling -- that would just add latency).
> What do these applications do right now? Do they set their own
> processor / memory affinity? This might actually be outside the scope
> of MPI...? (I'm not trying to shrug off responsibility, but this
> of MPI...? (I'mm not trying to shrug off responsibility, but this
> might be a case where the MPI simply doesn't have enough information,
> and to get that information [e.g., via MPI attributes or MPI info
> arguments] would be more hassle than the user just setting the affinity
> themselves...?)
> Comments?

If you set things up such that you can specify input parameters on where
to put each process, you have the flexibility you want. The locality
interfaces I have seen all mimicked the IRIX API, which had these
capabilities. If you want some ideas, look at LA-MPI; it does this - the
implementation is pretty strange (just the coding), but it is there.


> --
> {+} Jeff Squyres
> {+} The Open MPI Project
> {+}
> _______________________________________________
> devel mailing list
> devel_at_[hidden]