Open MPI Development Mailing List Archives

Subject: [OMPI devel] memory binding
From: David Singleton (David.Singleton_at_[hidden])
Date: 2010-12-10 16:56:33


Is there any plan to support NUMA memory binding for tasks?

Even with bind-to-core and memory affinity in 1.4.3 we were seeing 15-20%
variation in run times on a Nehalem cluster. This turned out to be mostly due
to bad page placement. Residual pagecache pages from the last job on a node (or
the memory of a suspended job in the case of preemption) could occasionally cause
a lot of non-local page placement. We hacked the libnuma module to apply MPOL_BIND so
that tasks are bound to their local memory, and that eliminated the majority of this
variability. We are currently running with this as the default behaviour since it's "the
right thing" for 99% of jobs (we have an environment variable to back off to affinity
for the rest).
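
In case it helps make the idea concrete, here is a minimal standalone sketch of the
kind of binding our hack applies (illustrative only, not the actual patch we carry):
once a task has been pinned to a core, its future allocations are restricted to the
local NUMA node with a strict MPOL_BIND policy via libnuma.

/* Illustrative sketch only, not the actual patch: after a task has been
 * pinned to a core, restrict its allocations to the local NUMA node
 * with a strict MPOL_BIND policy via libnuma (link with -lnuma). */
#define _GNU_SOURCE
#include <numa.h>
#include <sched.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "libnuma not available on this system\n");
        return 1;
    }

    int cpu  = sched_getcpu();          /* core this task was bound to */
    int node = numa_node_of_cpu(cpu);   /* its local NUMA node */

    struct bitmask *nodes = numa_allocate_nodemask();
    numa_bitmask_setbit(nodes, node);

    /* Strict binding (MPOL_BIND): pages may only come from 'node',
     * unlike the weaker preferred/affinity policy. */
    numa_set_membind(nodes);
    numa_free_nodemask(nodes);

    /* ... the MPI application would run here ... */
    return 0;
}

With a strict policy the kernel has to reclaim the residual pagecache on the local node
rather than quietly spilling the task's pages onto a remote node, which is why this
removed most of the variability for us.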

I'm guessing/hoping that doing the above based on hwloc will be easier and more
maintainable. As a first pass, when is that likely to be an option?
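
For what it's worth, a purely hypothetical sketch of what that might look like through
hwloc's memory-binding API (my own illustration, not an existing Open MPI option):

/* Hypothetical illustration of the same idea expressed with hwloc,
 * not an existing Open MPI option: strictly bind memory to the NUMA
 * node(s) covering the cpuset the process is already bound to. */
#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    /* cpuset the process is currently bound to (e.g. by bind-to-core) */
    hwloc_bitmap_t cpuset = hwloc_bitmap_alloc();
    hwloc_get_cpubind(topo, cpuset, HWLOC_CPUBIND_PROCESS);

    /* Equivalent of MPOL_BIND: future allocations restricted to the
     * local node(s); fail rather than fall back if that can't be done. */
    if (hwloc_set_membind(topo, cpuset, HWLOC_MEMBIND_BIND,
                          HWLOC_MEMBIND_PROCESS | HWLOC_MEMBIND_STRICT) < 0)
        perror("hwloc_set_membind");

    hwloc_bitmap_free(cpuset);
    hwloc_topology_destroy(topo);
    return 0;
}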

David