I will commit it next week.
I did see a performance improvement in the worst-case scenario. I believe the improvement will become more noticeable as the number of CPUs increases.
On Thu, Jun 12, 2008 at 1:00 AM, Brad Benton <firstname.lastname@example.org> wrote:
I've looked over the code some more and run some initial tests with it. It didn't seem to hurt anything in the default case. I also consulted with George, and he would like to see these patches get in for 1.3. Since it seems to do no harm in the default case, I am okay with that as well. So, unless anyone else has objections, please go ahead and apply this to the trunk.
BTW, in your testing, were you able to measure any noticeable performance improvements?
Thanks & Regards,
On Tue, Jun 10, 2008 at 2:32 PM, Brad Benton <email@example.com> wrote:
My apologies for not replying sooner. I would like to look at these patches a bit more. Since this adds a feature (NUMA awareness in the SM BTL) and also introduces interface changes to the maffinity framework, I would also like to get George's opinion before deciding whether or not to bring this into the trunk before branching for 1.3.
On Tue, Jun 10, 2008 at 10:52 AM, Lenny Verkhovsky <firstname.lastname@example.org> wrote:
I didn't want to bring this up on the teleconference,
but I want to commit Gleb's NUMA awareness patch before you branch the trunk.
Since I didn't get any objections or responses about it, I guess it's OK.
---------- Forwarded message ----------
From: Lenny Verkhovsky
Date: Tue, Jun 3, 2008 at 2:38 PM
Subject: [OMPI devel] SM BTL NUMA awareness patches
To: Open MPI Developers <email@example.com>
If there are no comments on this patch,
I can commit it.
The two attached patches implement NUMA awareness in the SM BTL. The first one
adds two new functions to the maffinity framework that are required by the
second patch. The functions are:
opal_maffinity_base_node_name_to_id() - takes a string that represents a
                                        memory node name and translates
                                        it to a memory node id.
opal_maffinity_base_bind() - binds an address range to a specific
                             memory node.
The bind() function cannot be implemented by all maffinity components.
(There is no way the first_use maffinity component can implement such
functionality.) In that case this function can be set to NULL.
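As a rough illustration of the interface described above, here is a minimal
sketch of a module table whose bind entry may be NULL, with a base-level
wrapper that checks for that before calling through. Every name in it
(sketch_maffinity_module_t, sketch_base_bind, the "mem0"/"mem1" node names,
the return codes) is hypothetical and is not taken from the patches; the real
signatures are in the attached diffs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical return codes, not Open MPI's. */
#define SKETCH_SUCCESS        0
#define SKETCH_NOT_SUPPORTED -1
#define SKETCH_NOT_FOUND     -2

/* Hypothetical module table: a component fills in what it can support. */
typedef struct {
    int (*node_name_to_id)(const char *name, int *id);
    int (*bind)(void *addr, size_t len, int node_id); /* may be NULL */
} sketch_maffinity_module_t;

/* Toy name table standing in for a real topology lookup. */
static const char *sketch_node_names[] = { "mem0", "mem1" };

/* Translate a memory node name into its index in the toy table. */
int sketch_name_to_id(const char *name, int *id)
{
    size_t n = sizeof(sketch_node_names) / sizeof(sketch_node_names[0]);

    assert(name != NULL && id != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, sketch_node_names[i]) == 0) {
            *id = (int)i;
            return SKETCH_SUCCESS;
        }
    }
    return SKETCH_NOT_FOUND;
}

/* An assumed component that cannot bind memory, so it leaves bind NULL. */
static const sketch_maffinity_module_t sketch_no_bind_component = {
    .node_name_to_id = sketch_name_to_id,
    .bind = NULL,
};

/* Base-level wrapper: report "not supported" instead of calling through NULL. */
int sketch_base_bind(const sketch_maffinity_module_t *m,
                     void *addr, size_t len, int node_id)
{
    if (m == NULL || m->bind == NULL) {
        return SKETCH_NOT_SUPPORTED;
    }
    return m->bind(addr, len, node_id);
}
```

With this shape, a caller such as the SM MPOOL can treat a "not supported"
result as "fall back to the default placement" rather than as a hard error.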
The second patch adds NUMA awareness support to the SM BTL and SM MPOOL. Each
process determines which CPU it is running on and exchanges this information
with the other local processes. Each process creates a separate MPOOL for every
available memory node and uses them to allocate memory on specific memory
nodes when needed. For instance, circular buffer memory is always allocated
on the memory node local to the receiving process.
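The allocation policy described above can be sketched roughly as follows. This
is only a model of the idea (one pool per memory node, circular buffers taken
from the pool on the receiver's node), not the code in the patch. The types,
the function names, and the peer_node table that stands in for the exchanged
CPU/node information are all hypothetical, and the "pools" only do
bookkeeping instead of allocating real memory.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 8

/* Hypothetical stand-in for one memory pool tied to one memory node. */
typedef struct {
    int    node_id;
    size_t bytes_allocated;   /* bookkeeping only; no real memory here */
} sketch_pool_t;

/* Per-process state: one pool per memory node, plus each local peer's node. */
typedef struct {
    sketch_pool_t pools[SKETCH_MAX_NODES];
    int           num_nodes;
    const int    *peer_node;  /* peer_node[rank] = memory node of that rank */
    int           num_peers;
} sketch_sm_state_t;

/* Create one pool for every memory node and remember the peer table. */
void sketch_state_init(sketch_sm_state_t *s, int num_nodes,
                       const int *peer_node, int num_peers)
{
    assert(s != NULL && peer_node != NULL);
    assert(num_nodes > 0 && num_nodes <= SKETCH_MAX_NODES);
    s->num_nodes = num_nodes;
    s->peer_node = peer_node;
    s->num_peers = num_peers;
    for (int i = 0; i < num_nodes; i++) {
        s->pools[i].node_id = i;
        s->pools[i].bytes_allocated = 0;
    }
}

/* Pick the pool that lives on the receiver's memory node. */
sketch_pool_t *sketch_pool_for_receiver(sketch_sm_state_t *s, int receiver_rank)
{
    if (receiver_rank < 0 || receiver_rank >= s->num_peers) {
        return NULL;
    }
    int node = s->peer_node[receiver_rank];
    if (node < 0 || node >= s->num_nodes) {
        return NULL;
    }
    return &s->pools[node];
}

/* Charge a circular buffer to the receiver's pool; returns the node id used,
 * or -1 if the receiver or its node is unknown. */
int sketch_alloc_fifo(sketch_sm_state_t *s, int receiver_rank, size_t bytes)
{
    sketch_pool_t *p = sketch_pool_for_receiver(s, receiver_rank);
    if (p == NULL) {
        return -1;
    }
    p->bytes_allocated += bytes;
    return p->node_id;
}
```

For example, with four local ranks where ranks 0 and 1 sit on node 0 and
ranks 2 and 3 sit on node 1, a buffer that rank 2 reads from would be charged
to the node 1 pool regardless of which rank sends into it.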
To use this on a Linux machine, a carto file with a description of the HW
topology should be provided. Processes should be bound to specific CPUs (by
specifying a rank file, for instance), and the session directory should be
created on a tmpfs file system (otherwise Linux ignores memory binding
commands) by setting the orte_tmpdir_base parameter to point to a tmpfs mount point.
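Since the tmpfs requirement is easy to get wrong, a small check along the
following lines may be useful before launching. It is a standalone sketch
that uses the Linux statfs() call and the TMPFS_MAGIC constant from
<linux/magic.h>; it is not part of the patches, and the function name
sketch_is_on_tmpfs is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/vfs.h>      /* statfs(), Linux-specific */
#include <linux/magic.h>  /* TMPFS_MAGIC */

/* Returns 1 if path is on a tmpfs file system, 0 if it is on some other
 * file system, and -1 if statfs() fails (for example, the path is missing). */
int sketch_is_on_tmpfs(const char *path)
{
    struct statfs fs;

    assert(path != NULL);
    if (statfs(path, &fs) != 0) {
        return -1;
    }
    return fs.f_type == TMPFS_MAGIC ? 1 : 0;
}
```

Running this on the directory you intend to pass as orte_tmpdir_base should
return 1; any other result suggests the memory binding would be ignored.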
Questions and suggestions are always welcome.