Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] Bug report: hwloc topology broken when restricted to cpusets
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2010-07-13 05:46:10


On 13/07/2010 11:22, Bernd Kallies wrote:
>> /bin/echo 0-4 > /dev/cpuset/mycpuset/cpus
>> /bin/echo 0-1 > /dev/cpuset/mycpuset/mems
>> /bin/echo $$ > /dev/cpuset/mycpuset/tasks
>> /sw/local/packages/hwloc-1.0.1/bin/lstopo
>>
> Machine (142GB)
> NUMANode #0 (phys=0 71GB) + Socket #0 + L3 #0 (8192KB)
> L2 #0 (256KB) + L1 #0 (32KB) + Core #0 + PU #0 (phys=0)
> L2 #1 (256KB) + L1 #1 (32KB) + Core #1 + PU #1 (phys=1)
> L2 #2 (256KB) + L1 #2 (32KB) + Core #2 + PU #2 (phys=2)
> L2 #3 (256KB) + L1 #3 (32KB) + Core #3 + PU #3 (phys=3)
> NUMANode #1 (phys=1 71GB) + Socket #1 + L3 #1 (8192KB) + L2 #4 (256KB)
> + L1 #4 (32KB) + Core #4 + PU #4 (phys=4)
>
>> /sw/local/packages/hwloc-1.0.1/bin/lstopo --merge
>>
> Machine
> L3 #0 (8192KB)
> PU #0 (phys=0)
> PU #1 (phys=1)
> PU #2 (phys=2)
> PU #3 (phys=3)
> PU #4 (phys=4)
>

This looks good to me. When --merge is given, we only keep the most
important objects to simplify the output. PU is considered the most
important object type, since that's where you bind processes in the end.
That's why

  NUMANode #1 (phys=1 71GB) + Socket #1 + L3 #1 (8192KB) + L2 #4 (256KB) + L1 #4 (32KB) + Core #4 + PU #4 (phys=4)

is replaced by

  PU #4 (phys=4)

What would you like instead?

If you don't want to lose any information, just don't use --merge.
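
To make the collapsing concrete, here is a toy model in plain C with no hwloc dependency. It is only a sketch: `struct node`, `merge_topology()`, `build_example()` and the other helpers are hypothetical names invented for this illustration, not hwloc's API or its actual implementation. It assumes that every chain of single-child objects is reduced to its deepest member and that the root is always kept.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_CHILDREN 8
#define POOL_SIZE 128

/* Toy stand-in for a topology object; this is NOT hwloc's hwloc_obj_t. */
struct node {
    const char *name;
    int nchildren;
    struct node *children[MAX_CHILDREN];
};

static struct node pool[POOL_SIZE];
static int used;

/* Take a fresh, childless node from the static pool. */
static struct node *mk(const char *name)
{
    struct node *n;
    assert(used < POOL_SIZE);
    n = &pool[used++];
    n->name = name;
    n->nchildren = 0;
    return n;
}

/* Attach child to parent and return the child, so calls can be chained. */
static struct node *add(struct node *parent, struct node *child)
{
    assert(parent->nchildren < MAX_CHILDREN);
    parent->children[parent->nchildren++] = child;
    return child;
}

/* Skip down any single-child chain, then do the same for each subtree. */
static struct node *collapse(struct node *n)
{
    while (n->nchildren == 1)
        n = n->children[0];
    for (int i = 0; i < n->nchildren; i++)
        n->children[i] = collapse(n->children[i]);
    return n;
}

/* Collapse everything below the root; the root itself is always kept. */
void merge_topology(struct node *root)
{
    for (int i = 0; i < root->nchildren; i++)
        root->children[i] = collapse(root->children[i]);
}

/* Build the unmerged tree from the lstopo output quoted above. */
struct node *build_example(void)
{
    static const char *pu[] = {"PU #0", "PU #1", "PU #2", "PU #3", "PU #4"};
    struct node *machine = mk("Machine");
    struct node *l3_0 = add(add(add(machine, mk("NUMANode #0")),
                                mk("Socket #0")), mk("L3 #0"));
    struct node *l3_1 = add(add(add(machine, mk("NUMANode #1")),
                                mk("Socket #1")), mk("L3 #1"));
    for (int i = 0; i < 4; i++)
        add(add(add(add(l3_0, mk("L2")), mk("L1")), mk("Core")), mk(pu[i]));
    add(add(add(add(l3_1, mk("L2")), mk("L1")), mk("Core")), mk(pu[4]));
    return machine;
}

/* Print one node per line, indented by two spaces per level. */
void print_tree(const struct node *n, int depth)
{
    printf("%*s%s\n", 2 * depth, "", n->name);
    for (int i = 0; i < n->nchildren; i++)
        print_tree(n->children[i], depth + 1);
}
```

Calling `merge_topology()` on the tree returned by `build_example()` and then `print_tree(root, 0)` leaves Machine with two children: L3 #0, which still holds PU #0 to PU #3, and a bare PU #4. That is the same shape as the --merge output quoted above.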

> #include <stdio.h>
> #include <stdlib.h>
> #include <hwloc.h>
> int main(void) {
> int npu, i, j;
> hwloc_topology_t topology;
> hwloc_obj_t *pu, parent;
>
> /* Allocate and initialize topology object. */
> hwloc_topology_init(&topology);
> /* Perform the topology detection. */
> hwloc_topology_ignore_all_keep_structure(topology);
> hwloc_topology_load(topology);
> /* Collect all HWLOC_OBJ_PU */
> npu = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_PU);
> pu = (hwloc_obj_t *)malloc(npu * sizeof(*pu));
> pu[0] = hwloc_get_next_obj_by_type(topology, HWLOC_OBJ_PU, NULL);
> hwloc_get_closest_objs(topology, pu[0], &pu[1], npu - 1);
> /* Determine common parent */
> for(i = 0; i < npu - 1; i++) {
> for(j = i + 1; j < npu; j++) {
> parent = hwloc_get_common_ancestor_obj(topology, pu[i], pu[j]);
> printf("%2d %2d common parent type %d\n", i, j, parent->type);
> }
> }
> }
>
>> gcc -I/sw/local/packages/hwloc-1.0.1/include
>>
> -L/sw/local/packages/hwloc-1.0.1/lib
> -Wl,-rpath,/sw/local/packages/hwloc-1.0.1/lib -lhwloc test.c
>
>> ./a.out
>>
> 0 1 common parent type 4
> 0 2 common parent type 4
> 0 3 common parent type 4
> Segmentation fault
>

I'll debug this, thanks.

Brice