Hardware Locality Development Mailing List Archives

Subject: [hwloc-devel] Bug report: hwloc topology broken when restricted to cpusets
From: Bernd Kallies (kallies_at_[hidden])
Date: 2010-07-13 05:22:35


I'd like to report the following bug with hwloc-1.0.1:

When a Linux cpuset (see cpuset(7)) is created with a subset of the
machine's resources and a hwloc application is bound to it, the hwloc
API may return a broken topology once the topology is restricted to
objects that have children (as done by lstopo --merge and by
hwloc_topology_ignore_all_keep_structure() in the example below).

Here is a worked example on a machine running Linux kernel
2.6.16.60-0.42.5-smp with two quad-core Nehalem X5570 sockets and
hyperthreading enabled (the shell prompt is >).

We start in the cpuset named /, which contains all 16 logical processing
units and both memory nodes, and run the lstopo command.
Then we create a cpuset containing the first 5 logical processing units
and bind the current shell to it. We run lstopo again. With the --merge
option the output looks strange: the second NUMA node with its L3 cache
is missing, and PU #4 is left on its own. Finally, we compile a small
executable that tries to fetch the common parents of all processor
pairs of the topology. This application crashes with SIGSEGV.

> cat /proc/self/cpuset
/
> /sw/local/packages/hwloc-1.0.1/bin/lstopo
Machine (142GB)
  NUMANode #0 (phys=0 71GB) + Socket #0 + L3 #0 (8192KB)
    L2 #0 (256KB) + L1 #0 (32KB) + Core #0
      PU #0 (phys=0)
      PU #1 (phys=8)
    L2 #1 (256KB) + L1 #1 (32KB) + Core #1
      PU #2 (phys=1)
      PU #3 (phys=9)
    L2 #2 (256KB) + L1 #2 (32KB) + Core #2
      PU #4 (phys=2)
      PU #5 (phys=10)
    L2 #3 (256KB) + L1 #3 (32KB) + Core #3
      PU #6 (phys=3)
      PU #7 (phys=11)
  NUMANode #1 (phys=1 71GB) + Socket #1 + L3 #1 (8192KB)
    L2 #4 (256KB) + L1 #4 (32KB) + Core #4
      PU #8 (phys=4)
      PU #9 (phys=12)
    L2 #5 (256KB) + L1 #5 (32KB) + Core #5
      PU #10 (phys=5)
      PU #11 (phys=13)
    L2 #6 (256KB) + L1 #6 (32KB) + Core #6
      PU #12 (phys=6)
      PU #13 (phys=14)
    L2 #7 (256KB) + L1 #7 (32KB) + Core #7
      PU #14 (phys=7)
      PU #15 (phys=15)
> /sw/local/packages/hwloc-1.0.1/bin/lstopo --merge
Machine
  L3 #0 (8192KB)
    Core #0
      PU #0 (phys=0)
      PU #1 (phys=8)
    Core #1
      PU #2 (phys=1)
      PU #3 (phys=9)
    Core #2
      PU #4 (phys=2)
      PU #5 (phys=10)
    Core #3
      PU #6 (phys=3)
      PU #7 (phys=11)
  L3 #1 (8192KB)
    Core #4
      PU #8 (phys=4)
      PU #9 (phys=12)
    Core #5
      PU #10 (phys=5)
      PU #11 (phys=13)
    Core #6
      PU #12 (phys=6)
      PU #13 (phys=14)
    Core #7
      PU #14 (phys=7)
      PU #15 (phys=15)
> /bin/echo 0-4 > /dev/cpuset/mycpuset/cpus
> /bin/echo 0-1 > /dev/cpuset/mycpuset/mems
> /bin/echo $$ > /dev/cpuset/mycpuset/tasks
> /sw/local/packages/hwloc-1.0.1/bin/lstopo
Machine (142GB)
  NUMANode #0 (phys=0 71GB) + Socket #0 + L3 #0 (8192KB)
    L2 #0 (256KB) + L1 #0 (32KB) + Core #0 + PU #0 (phys=0)
    L2 #1 (256KB) + L1 #1 (32KB) + Core #1 + PU #1 (phys=1)
    L2 #2 (256KB) + L1 #2 (32KB) + Core #2 + PU #2 (phys=2)
    L2 #3 (256KB) + L1 #3 (32KB) + Core #3 + PU #3 (phys=3)
  NUMANode #1 (phys=1 71GB) + Socket #1 + L3 #1 (8192KB) + L2 #4 (256KB) + L1 #4 (32KB) + Core #4 + PU #4 (phys=4)
> /sw/local/packages/hwloc-1.0.1/bin/lstopo --merge
Machine
  L3 #0 (8192KB)
    PU #0 (phys=0)
    PU #1 (phys=1)
    PU #2 (phys=2)
    PU #3 (phys=3)
  PU #4 (phys=4)
> cat test.c
#include <stdio.h>
#include <stdlib.h>
#include <hwloc.h>
int main(void) {
  int npu, i, j;
  hwloc_topology_t topology;
  hwloc_obj_t *pu, parent;

  /* Allocate and initialize topology object. */
  hwloc_topology_init(&topology);
  /* Keep only objects that add structure, then perform the detection. */
  hwloc_topology_ignore_all_keep_structure(topology);
  hwloc_topology_load(topology);
  /* Collect all HWLOC_OBJ_PU */
  npu = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_PU);
  pu = (hwloc_obj_t *)malloc(npu * sizeof(hwloc_obj_t));
  pu[0] = hwloc_get_next_obj_by_type(topology, HWLOC_OBJ_PU, NULL);
  hwloc_get_closest_objs(topology, pu[0], &pu[1], npu - 1);
  /* Determine common parent */
  for(i = 0; i < npu - 1; i++) {
    for(j = i + 1; j < npu; j++) {
      parent = hwloc_get_common_ancestor_obj(topology, pu[i], pu[j]);
      printf("%2d %2d common parent type %d\n", i, j, parent->type);
    }
  }
}
> gcc -I/sw/local/packages/hwloc-1.0.1/include \
    -L/sw/local/packages/hwloc-1.0.1/lib \
    -Wl,-rpath,/sw/local/packages/hwloc-1.0.1/lib -lhwloc test.c
> ./a.out
 0 1 common parent type 4
 0 2 common parent type 4
 0 3 common parent type 4
Segmentation fault
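
For illustration only, here is a minimal sketch of a common-ancestor
walk that reports failure instead of crashing when the tree is
inconsistent. It is a toy model, not hwloc's code: struct node and
common_ancestor_guarded() are hypothetical names, and whether a parent
or depth inconsistency is the actual cause of the SIGSEGV above is an
assumption that has not been verified.

```c
#include <stddef.h>

/* Hypothetical stand-in for a topology object; this is NOT hwloc's hwloc_obj. */
struct node {
    struct node *parent;  /* NULL for the root */
    unsigned depth;       /* level recorded for this object */
};

/*
 * Walk upward from a and b until both pointers meet.
 * Every step checks for NULL, so a tree whose recorded depths do not
 * match its parent links yields NULL rather than a NULL dereference.
 */
struct node *common_ancestor_guarded(struct node *a, struct node *b)
{
    /* Bring the deeper object up to the level of the shallower one. */
    while (a != NULL && b != NULL && a->depth > b->depth)
        a = a->parent;
    while (a != NULL && b != NULL && b->depth > a->depth)
        b = b->parent;

    /* Climb in lockstep until the two walks reach the same object. */
    while (a != NULL && b != NULL && a != b) {
        a = a->parent;
        b = b->parent;
    }

    /* NULL signals that one walk ran past the root before they met. */
    return (a != NULL && a == b) ? a : NULL;
}
```

In this model, a PU attached directly to the root but still recorded
at the depth of the other PUs makes the lockstep climb run past the
root, so the function returns NULL. A caller would have to test for
NULL before reading the type of the result.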

-- 
Dr. Bernd Kallies
Konrad-Zuse-Zentrum für Informationstechnik Berlin
Takustr. 7
14195 Berlin
Tel: +49-30-84185-270
Fax: +49-30-84185-311
e-mail: kallies_at_[hidden]