The devel trunk has all of this in it; you can get a tarball from
the OMPI web site (take the nightly snapshot).
I plan to work on cpuset support beginning Tues morning.
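
For anyone who wants to see what cpuset awareness would involve for the
test case quoted below, here is a minimal sketch, not OMPI code. It
assumes Linux, where os.sched_getaffinity(0) reports the CPUs the calling
process may run on (which should already reflect any cpuset restriction).
The helper name clamp_binding and the core numbering (socket 0 = cores
0-3, socket 1 = cores 4-7) are hypothetical, chosen only for illustration.

```python
import os


def clamp_binding(requested, allowed):
    """Return the CPUs in `requested` that are also in `allowed`.

    Hypothetical helper: raises ValueError when the requested binding
    lies entirely outside the allowed set, instead of attempting it.
    """
    usable = set(requested) & set(allowed)
    if not usable:
        raise ValueError(
            "requested CPUs %s are all outside the allowed set %s"
            % (sorted(requested), sorted(allowed))
        )
    return usable


if __name__ == "__main__":
    # Assumed layout for illustration only: socket 1 holds cores 4-7.
    socket1 = {4, 5, 6, 7}
    if hasattr(os, "sched_getaffinity"):  # Linux-only call
        allowed = os.sched_getaffinity(0)
        try:
            print("would bind to:", sorted(clamp_binding(socket1, allowed)))
        except ValueError as err:
            print("refusing to bind:", err)
```

In the scenario Chris describes, the allowed set would be socket 0 only,
so the sketch reports the conflict rather than trying to bind. What the
trunk code actually does in that case is exactly what needs testing.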
On Aug 17, 2009, at 7:18 PM, Chris Samuel wrote:
> ----- "Eugene Loh" <Eugene.Loh_at_[hidden]> wrote:
> Hi Eugene,
>> It would be even better to have binding selections adapt to other
>> bindings on the system.
> This touches on the earlier thread about making OMPI aware
> of its cpuset/cgroup allocation on the node (for those sites
> that are using it). It might solve this issue quite nicely, as
> OMPI would know precisely what cores & sockets were allocated
> for its use without having to worry about other HPC processes.
> No idea how to figure that out for processes outside of cpusets. :-(
>> In any case, regardless of what the best behavior is, I appreciate
>> the point about changing behavior in the middle of a stable release.
> Not a problem, and I take Jeff's point about 1.3 not being a
> super stable release and thus not being a blocker to changes
> such as this.
>> Arguably, leaving significant performance on the table in typical
>> situations is a bug that warrants fixing even in the middle of a
>> release, but I won't try to settle that debate here.
> I agree for those cases where there's no downside, and thinking
> further on your point of balancing between sockets I can see why
> that would limit the impact.
> Most of the cases I can think of that would be most adversely
> affected are down to other jobs binding to cores naively, and if
> that's happening outside of cpusets then the cluster sysadmin
> has more to worry about from mixing those applications than
> from mixing with OMPI ones which are just binding to sockets. :-)
> So I'll happily withdraw my objection on those grounds.
> *But* I would like to test this code out on a cluster with
> cpuset support enabled to see whether it will behave itself.
> Basically, if I run a 4-core MPI job on a dual-socket system
> which has been allocated only the cores on socket 0, what will
> happen when it tries to bind to socket 1, which is outside its
> cpuset?
> Is there a 1.3 branch or tarball with these patches applied
> that I could test out?
> Christopher Samuel - (03) 9925 4751 - Systems Manager
> The Victorian Partnership for Advanced Computing
> P.O. Box 201, Carlton South, VIC 3053, Australia
> VPAC is a not-for-profit Registered Research Agency