Open MPI Development Mailing List Archives

Subject: [OMPI devel] NUMA bug in openib BTL device selection
From: Rolf vandeVaart (rvandevaart_at_[hidden])
Date: 2014-01-10 17:36:50


I believe I found a bug in the openib BTL and want to see if folks agree. When we are running on a NUMA node and are bound to a CPU, we only want to use the IB device that is closest to us. However, I observed that we always used both devices regardless of binding. I believe the bug is in how we compute the distances, and the change below fixes it. It was introduced with r26391, when we switched to using hwloc to determine distances. It is a simple error: we are supposed to index the latency array with i+j*size, but the current code multiplies where it should add (i*j*size).

With this change, we will only use the IB devices that are close to us.
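
To illustrate the indexing error, here is a minimal sketch, not the openib code itself. It assumes the latency matrix is a flat array of floats with nbobjs*nbobjs entries. The helper names (flat_index, buggy_index, pair_distance) and the 2-node matrix values are hypothetical, made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Index of entry (i, j) in an n x n matrix stored as a flat array,
 * using the i + j*size form described above. */
static size_t flat_index(size_t i, size_t j, size_t n)
{
    assert(i < n && j < n);
    return i + j * n;
}

/* The pre-patch expression: a product instead of a sum. It is 0 whenever
 * i or j is 0, and for larger indices it can exceed n*n - 1. */
static size_t buggy_index(size_t i, size_t j, size_t n)
{
    return i * (j * n);
}

/* Symmetrized distance: look up both directions and keep the larger one,
 * as the patched code does. */
static float pair_distance(const float *latency, size_t n, size_t x, size_t y)
{
    float a = latency[flat_index(x, y, n)];
    float b = latency[flat_index(y, x, n)];
    return (a > b) ? a : b;
}

/* Hypothetical 2-node latency matrix (assumed values):
 * local access = 1.0, remote access = 2.0. */
static const float example_latency[4] = { 1.0f, 2.0f,
                                          2.0f, 1.0f };
```

With this hypothetical matrix, pair_distance(example_latency, 2, 0, 1) reads the two remote entries and returns 2.0. The product form maps both cross-node lookups, (0, 1) and (1, 0), to entry 0, which holds the local latency, so a remote device would look as close as a local one. That would be consistent with both devices always being selected, though I have only checked it against this toy matrix.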

Any comments? Otherwise, I will commit.

Rolf

Index: ompi/mca/btl/openib/btl_openib_component.c
===================================================================
--- ompi/mca/btl/openib/btl_openib_component.c (revision 30175)
+++ ompi/mca/btl/openib/btl_openib_component.c (working copy)
@@ -2202,10 +2202,10 @@
         if (NULL != my_obj) {
             /* Distance may be asymetrical, so calculate both of them
                and take the max */
-            a = hwloc_distances->latency[my_obj->logical_index *
+            a = hwloc_distances->latency[my_obj->logical_index +
                                          (ibv_obj->logical_index *
                                           hwloc_distances->nbobjs)];
-            b = hwloc_distances->latency[ibv_obj->logical_index *
+            b = hwloc_distances->latency[ibv_obj->logical_index +
                                          (my_obj->logical_index *
                                           hwloc_distances->nbobjs)];
             distance = (a > b) ? a : b;
@@ -2224,10 +2224,10 @@
                                                             ibv_obj->cpuset,
                                                             HWLOC_OBJ_NODE, ++i)) {
 
-            a = hwloc_distances->latency[node_obj->logical_index *
+            a = hwloc_distances->latency[node_obj->logical_index +
                                          (ibv_obj->logical_index *
                                           hwloc_distances->nbobjs)];
-            b = hwloc_distances->latency[ibv_obj->logical_index *
+            b = hwloc_distances->latency[ibv_obj->logical_index +
                                          (node_obj->logical_index *
                                           hwloc_distances->nbobjs)];
             a = (a > b) ? a : b;