FWIW: Jeff and I chatted about this on the phone and came up with two
issues that need resolving:
1. we use mpi_paffinity_alone to indicate that we should bind
processes, yet the orteds have no way of seeing that MCA param as it
is registered and evaluated in the MPI layer. We propose to resolve
this by (a) declaring an opal_paffinity_alone MCA param in the
paffinity framework, and then (b) declaring an alias of
mpi_paffinity_alone for it, also in the paffinity framework. This
obviously is an abstraction break, but we feel it is an acceptable one
under the circumstances.
(Our apologies to Lenny, whose ears were boxed for doing just this.)
This will allow the orteds to check whether processes should be
bound before launching them.
2. we would not be able to bind processes launched without daemons
under systems that do not provide their own process binding
capability. For example, on Torque, we have an ability to natively
launch processes from within mpirun - those processes currently can
bind themselves in MPI_Init, but would not be able to do so any longer
under this proposed change.
To alleviate that problem, we propose to leave the process binding
code that is currently in MPI_Init, but surround it with a test to see
if an MCA param has been set indicating that the proc is to use that
code to bind itself. Thus, when launching without daemons (but via
mpirun), we can set the flag and instruct the procs to bind
themselves. However, procs that are launched without daemons via
something which has its own binding capability (e.g., SLURM), and
procs that were launched via daemon (and hence would have already been
bound), would not attempt to do so.
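To make the two paths concrete, here is a rough sketch of the control flow, not the actual implementation. It is written in Python for brevity and assumes a Linux host (os.sched_setaffinity is Linux-only). The flag name EXAMPLE_PROC_SELF_BIND and all function names are made up for illustration; the real flag would be an MCA param, and the real code would go through the paffinity framework rather than calling the OS directly.

```python
import os

# Hypothetical flag name for illustration only; the real mechanism would be
# an MCA param, and this is not an actual Open MPI setting.
SELF_BIND_ENV = "EXAMPLE_PROC_SELF_BIND"


def bind_self(cpu):
    """Bind the calling process to a single CPU (Linux-only call)."""
    os.sched_setaffinity(0, {cpu})


def launch_bound(cpu, argv):
    """Daemon path: fork, bind the child, then exec the application."""
    pid = os.fork()
    if pid == 0:
        # Child: still running launcher code, so it can bind itself
        # before the application image replaces it.
        try:
            bind_self(cpu)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if the bind or the exec failed
    return pid


def self_bind_requested(environ=os.environ):
    """True only when the launcher explicitly set the self-bind flag."""
    return environ.get(SELF_BIND_ENV) == "1"


def maybe_bind_self(cpu, environ=os.environ):
    """Process path: stands in for the guarded binding code in MPI_Init."""
    if not self_bind_requested(environ):
        # Launched by a daemon (already bound) or by a launcher with its
        # own binding capability, so do nothing.
        return False
    bind_self(cpu)
    return True
```

In this sketch, mpirun would set the flag only when launching without daemons on a system that has no binding support of its own; in every other case the flag stays unset and the proc leaves its affinity alone.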
Any further thoughts are welcome...
On May 7, 2009, at 12:59 PM, Ralph Castain wrote:
> I can do the coding - just want to ensure interested others get
> their $0.002 in on how it should work.
> I came up with a way to do it that doesn't require changes to the
> paffinity framework. I can complete the prototype next week on an hg
> branch and let you look at it. Mostly consists of moving what is now
> in MPI_Init into the odls modules between the fork and exec, as
> Brian suggested.
> On May 7, 2009, at 12:43 PM, Terry Dontje wrote:
>> Brian W. Barrett wrote:
>>> On Wed, 6 May 2009, Ralph Castain wrote:
>>>> Any thoughts on this? Should we change it?
>>> Yes, we should change this (IMHO) :).
>> Me too.
>>>> If so, who wants to be involved in the re-design? I'm pretty sure
>>>> it would require some modification of the paffinity framework,
>>>> plus some minor mods to the odls framework and (since you cannot
>>>> bind a process other than yourself) addition of a new small
>>>> "proxy" script that would bind-then-exec each process started by
>>>> the orted (Eugene posted a candidate on the user list, though we
>>>> will have to deal with some system-specific issues in it).
>>> I can't contribute a whole lot of time, but I'd be happy to lurk,
>>> offer advice, and write some small bits of code. But I definitely
>>> can't lead.
>>> First offering of opinion from me. I think we can avoid the
>>> "proxy" script by doing the binding after the fork but before the
>>> exec. This will definitely require minor changes to the odls and
>>> probably a bunch of changes to the paffinity framework. This will
>>> make things slightly less fragile than a script would, and yet get
>>> us what we want.
>> I'll have to talk with Len to see if Sun has any time to allocate
>> to this.