On Dec 13, 2010, at 4:22 PM, David Singleton wrote:
> I didn't see memory binding in there explicitly.
You're correct; sorry, I was just referring to some general slides that showed some of the ideas that we're working on for next-generation affinity stuff. But memory binding will be included as well.
>> What OS and libnuma version are you running? It has been my experience that libnuma can lie on RHEL 5 and earlier. My (possibly flawed) understanding is that this is because of lack of proper kernel support; such "proper" kernel support was only added fairly recently (2.6.30something).
>
> That's interesting. By "lie", do you mean processes are not really memory bound?
I mean that even when using a strict memory binding policy, if you numa_alloc* on node X, you can get memory on node Y.
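One way to look for this kind of misplacement is to inspect the per-node page counts that Linux reports in /proc/<pid>/numa_maps, where each mapping line carries fields of the form N<node>=<pages>. The following is only a minimal sketch of that idea: the helper names and the sample line are made up for illustration, and it is not the test that either poster actually ran.

```python
import re


def pages_per_node(numa_maps_line):
    """Return {node: pages} from the N<node>=<pages> fields of one numa_maps line."""
    counts = {}
    for field in numa_maps_line.split():
        match = re.fullmatch(r"N(\d+)=(\d+)", field)
        if match:
            counts[int(match.group(1))] = int(match.group(2))
    return counts


def misplaced_pages(numa_maps_line, wanted_node):
    """Count pages of this mapping that sit on any node other than wanted_node."""
    per_node = pages_per_node(numa_maps_line)
    return sum(pages for node, pages in per_node.items() if node != wanted_node)


# Made-up sample line: a mapping bound to node 0 with some pages on node 1.
sample = "7f5c3a000000 bind:0 anon=1024 dirty=1024 N0=768 N1=256 kernelpagesize_kB=4"

print(pages_per_node(sample))
print(misplaced_pages(sample, wanted_node=0))
```

A non-zero result for a mapping that was allocated under a strict binding to node 0 would correspond to the "memory on node Y" symptom described above.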
> We're running 2.6.27.55 (and numactl 0.9.8-11.el5) and I've done quite a bit of
> testing that always looks correct.
That could well be.
On RHEL 5 (2.6.18 and numactl-0.9.8), the above "bad" behavior happens. With RHEL 6 (2.6.32 and numactl-2.0.3), it seems to be correct. Where exactly the issue was fixed, I'm not entirely sure.
>> That aside, it's somewhat disappointing that MPOL_PREFERRED is not working well and that you had to switch to MPOL_BIND. :-(
>
> I'm not sure it's disappointing - I think it's just to be expected. For sites that
> drop caches or run a whole node memhog or reboot nodes between jobs, MPOL_PREFERRED
> will do the right thing. For sites that are not so careful or use suspend/resume
> scheduling, memory overcommits and some amount of page reclaim or paging on job
> startup will happen occasionally. Paying the extra cost of making sure that page
> reclaim or paging results in ideal locality is definitely a big win for a job
> overall. (Paging suspended jobs back in after they are resumed can undo some of
> their ideal placement but that can be handled.)
Fair enough.
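The trade-off David describes can be illustrated with a toy model. This is only a sketch under strong assumptions: memory is counted in whole pages, "reclaim" always succeeds at a cost of one reclaimed page per allocated page, and the function and variable names are invented here. Real kernel behavior under MPOL_BIND and MPOL_PREFERRED is considerably more involved.

```python
def allocate(policy, target, free_pages, n_pages):
    """Toy model: place n_pages and return (placement, reclaimed).

    free_pages maps node -> free page count and is updated in place.
    placement maps node -> number of pages placed on that node.
    reclaimed counts pages that had to be freed on the target node first.
    """
    placement = {}
    reclaimed = 0
    for _ in range(n_pages):
        node = target
        if free_pages[target] == 0:
            others = [n for n in sorted(free_pages)
                      if n != target and free_pages[n] > 0]
            if policy == "preferred" and others:
                # "preferred" falls back to another node instead of reclaiming.
                node = others[0]
            else:
                # "bind" (or nothing free anywhere): reclaim a page locally.
                reclaimed += 1
                free_pages[target] += 1
        free_pages[node] -= 1
        placement[node] = placement.get(node, 0) + 1
    return placement, reclaimed


# Node 0 is nearly full (e.g. stale page cache from a previous job).
print(allocate("preferred", 0, {0: 2, 1: 10}, 5))
print(allocate("bind", 0, {0: 2, 1: 10}, 5))
```

In this model "preferred" avoids the reclaim cost but leaves three of the five pages on the remote node, while "bind" pays for three reclaims and keeps every page local.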
>> Should we add an MCA parameter to switch between BIND and PREFERRED, and perhaps default to BIND?
>
> I'm not sure BIND should be the default for everyone - memory imbalanced jobs might
> page badly in this case. But, yes, we would like an MCA to choose and allow sites
> to select BIND as their default if they wish. An mpirun option like --bind-to-mem
> would need a preferred/affinity alternative and I'm not sure of a nice notation/
> syntax for that.
How about:
--mca maffinity_libnuma_policy bind|preferred
I can do that for the v1.5 series, if you'd like. I can't really do it for v1.4 because that series is in "bug fix only" mode. However, given that we're revamping all of our affinity support, I don't know what the future interface will look like -- so the name may change, or ...
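As a rough illustration of what the proposed bind|preferred value would have to select, here is a minimal sketch that validates the string and maps it to the corresponding Linux policy mode. The numeric values are the MPOL_* constants from linux/mempolicy.h; the function name and the choice of default are assumptions made for this example only, since the thread leaves the default open.

```python
# Policy mode values as defined in linux/mempolicy.h.
MPOL_PREFERRED = 1
MPOL_BIND = 2

POLICY_MODES = {"bind": MPOL_BIND, "preferred": MPOL_PREFERRED}


def parse_policy(value, default="preferred"):
    """Map a bind|preferred string to its MPOL_* mode (hypothetical helper).

    The default used when no value is given is an assumption, not a decision
    taken in this thread.
    """
    name = default if value is None else value.strip().lower()
    if name not in POLICY_MODES:
        raise ValueError("policy must be one of: " + "|".join(sorted(POLICY_MODES)))
    return POLICY_MODES[name]


print(parse_policy("bind"))
print(parse_policy(None))
```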
--
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/