
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Heads up on new feature to 1.3.4
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-08-17 22:05:25


Hi Chris

The devel trunk has all of this in it - you can get that tarball from
the OMPI web site (take the nightly snapshot).

I plan to work on cpuset support beginning Tues morning.

Ralph

On Aug 17, 2009, at 7:18 PM, Chris Samuel wrote:

>
> ----- "Eugene Loh" <Eugene.Loh_at_[hidden]> wrote:
>
> Hi Eugene,
>
> [...]
>> It would be even better to have binding selections adapt to other
>> bindings on the system.
>
> Indeed!
>
> This touches on the earlier thread about making OMPI aware
> of its cpuset/cgroup allocation on the node (for those sites
> that are using it); that might solve this issue quite nicely,
> as OMPI would know precisely which cores & sockets were
> allocated for its use without having to worry about other
> HPC processes.
>
> No idea how to figure that out for processes outside of cpusets. :-(
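[Archive note: the cpuset-awareness idea above can be demonstrated with a minimal standalone probe. This is not OMPI code; it is a sketch assuming a Linux host, where a process can query the CPUs its cpuset actually permits via the sched_getaffinity(2) syscall, which Python exposes as os.sched_getaffinity.]

```python
import os

# Ask the kernel which CPUs this process may run on (Linux only).
# Inside a cpuset/cgroup this reflects the job's allocation, so a
# launcher could restrict its binding choices to exactly these cores.
# Standalone illustrative probe, not Open MPI's actual implementation.
allowed = os.sched_getaffinity(0)  # 0 means "the calling process"

print(f"allowed CPUs: {sorted(allowed)}")
print(f"usable core count: {len(allowed)}")
```

Outside any cpuset this simply reports every online CPU, which matches the observation above: the interesting case is only distinguishable when a resource manager has actually confined the job.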
>
>> In any case, regardless of what the best behavior is, I appreciate
>> the point about changing behavior in the middle of a stable release.
>
> Not a problem, and I take Jeff's point about 1.3 not being a
> super stable release and thus not being a blocker to changes
> such as this.
>
>> Arguably, leaving significant performance on the table in typical
>> situations is a bug that warrants fixing even in the middle of a
>> release, but I won't try to settle that debate here.
>
> I agree for those cases where there's no downside, and thinking
> further on your point of balancing between sockets I can see why
> that would limit the impact.
>
> Most of the cases I can think of that would be adversely
> affected come down to other jobs binding to cores naively,
> and if that's happening outside of cpusets then the cluster
> sysadmin has more to worry about from mixing those applications
> than from mixing them with OMPI ones which are just binding
> to sockets. :-)
>
> So I'll happily withdraw my objection on those grounds.
>
> *But* I would like to test this code out on a cluster with
> cpuset support enabled to see whether it will behave itself.
>
> Basically, if I run a 4-core MPI job on a dual-socket system
> which has been allocated only the cores on socket 0, what will
> happen when it tries to bind to socket 1, which is outside its
> cpuset?
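[Archive note: in Linux terms the failure mode asked about here is well-defined. sched_setaffinity(2) returns EINVAL when the requested mask contains no CPU that the caller is permitted to use, so a bind attempt aimed entirely outside the cpuset fails loudly rather than silently escaping it. A minimal sketch, again standalone Python rather than OMPI code; the CPU number is a deliberately out-of-range stand-in for a core the job's cpuset does not contain.]

```python
import os

# Try to bind the calling process to a CPU outside its allowed set.
# The kernel rejects an affinity mask that intersects no permitted
# CPUs, so the call raises OSError (EINVAL) instead of succeeding.
# CPU 99999 is a hypothetical stand-in for a disallowed core.
try:
    os.sched_setaffinity(0, {99999})
    print("bind succeeded (unexpected on any ordinary host)")
except OSError as e:
    print(f"bind rejected: {e}")  # e.g. [Errno 22] Invalid argument
```

Whether OMPI surfaces that error cleanly or aborts the job is exactly what the proposed test on a cpuset-enabled cluster would establish.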
>
> Is there a 1.3 branch or tarball with these patches applied
> that I could test out?
>
> cheers,
> Chris
> --
> Christopher Samuel - (03) 9925 4751 - Systems Manager
> The Victorian Partnership for Advanced Computing
> P.O. Box 201, Carlton South, VIC 3053, Australia
> VPAC is a not-for-profit Registered Research Agency
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel