Hardware Locality Development Mailing List Archives

Subject: [hwloc-devel] hwloc-distrib --among
From: Jirka Hladky (jhladky_at_[hidden])
Date: 2010-11-16 15:30:37


Hi Brice,

I recently gave an internal presentation on hwloc. It was a success and people
liked it. One colleague tried it on an 8-socket box, and we found that the
memory was installed in the wrong slots, resulting in a very strange NUMA
configuration.

There was some discussion about hwloc-distrib --among.

If I understand it correctly, --among accepts one of
{pu,core,socket,node,machine}

Should it also support an option of the form socket:0? I have tried it, but it
does not work for me.

I do not understand the results:

=======================================================
$ hwloc-calc --po --proclist $(hwloc-distrib --single --among machine 4)
0,2,1,3

$ hwloc-calc --po --proclist $(hwloc-distrib --single --among numa 4)
0,2,1,3

$ hwloc-calc --po --proclist $(hwloc-distrib --single --among socket 4)
0,2,1,3

This seems to be OK.

$ hwloc-calc --po --proclist $(hwloc-distrib --single --among core 4)
0,2,4,6

All four PUs are in Socket p#1. Is this distributing among Socket:1 only??

$ hwloc-calc --po --proclist $(hwloc-distrib --single --among pu 4)
0,8,2,10

All four PUs are on Core p#0 and Core p#1 of Socket p#1. Is this distributing
among Core:0 and Core:1 only??

$ lstopo --physical
Machine (12GB)
  NUMANode p#0 (6144MB) + Socket p#1 + L3 (12MB)
    L2 (256KB) + L1 (32KB) + Core p#0
      PU p#0
      PU p#8
    L2 (256KB) + L1 (32KB) + Core p#1
      PU p#2
      PU p#10
    L2 (256KB) + L1 (32KB) + Core p#9
      PU p#4
      PU p#12
    L2 (256KB) + L1 (32KB) + Core p#10
      PU p#6
      PU p#14
  NUMANode p#1 (6134MB) + Socket p#0 + L3 (12MB)
    L2 (256KB) + L1 (32KB) + Core p#0
      PU p#1
      PU p#9
    L2 (256KB) + L1 (32KB) + Core p#1
      PU p#3
      PU p#11
    L2 (256KB) + L1 (32KB) + Core p#9
      PU p#5
      PU p#13
    L2 (256KB) + L1 (32KB) + Core p#10
      PU p#7
      PU p#15
========================================================
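
To make it easier to check which socket and core each chosen PU belongs to,
here is a small shell sketch. It is not part of hwloc: pu_map and sockets_of
are hypothetical helper names, and the script assumes the lstopo text layout
shown above (a "Socket p#N" line, then "Core p#N" lines, then "PU p#N" lines).

```shell
# Sketch only, not part of hwloc. Reads lstopo text output on stdin and
# prints one line per PU: "<PU p#> <Socket p#> <Core p#>".
pu_map() {
  awk '
    # Remember the most recent socket and core seen above the current PU line.
    match($0, /Socket p#[0-9]+/) { socket = substr($0, RSTART + 9, RLENGTH - 9) }
    match($0, /Core p#[0-9]+/)   { core   = substr($0, RSTART + 7, RLENGTH - 7) }
    match($0, /PU p#[0-9]+/)     { print substr($0, RSTART + 5, RLENGTH - 5), socket, core }
  '
}

# Prints the distinct socket numbers that hold the PUs given as a
# comma-separated list in the first argument (e.g. "0,2,4,6").
sockets_of() {
  pu_map | awk -v list="$1" '
    BEGIN { n = split(list, want, ","); for (i = 1; i <= n; i++) wanted[want[i]] = 1 }
    ($1 in wanted) && !seen[$2]++ { print $2 }
  '
}
```

With the topology above saved to a file (say topo.txt, a name chosen only for
this example), "sockets_of 0,2,4,6 < topo.txt" prints just 1, which matches
what I see for --among core: all four PUs are in Socket p#1.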

Could you explain the usage model for --among? Which arguments are supported
and what effect do they have?

I have also attached the output of hwloc-gather-topology.sh for an 8-socket
system with two NUMA nodes. One NUMA node has 7 sockets associated with it,
whereas the other has just one socket connected to it.

I have tried various --among and --ignore options to distribute 8 parallel
jobs on the box so that each job runs on its own socket. I was not able to
achieve this.

Could you please try it? What command should I use? Or is this perhaps a bug?

I am using 1.1rc2.

=============8 socket system=======================
[root_at_hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore machine 8)
0,1,16,24,32,40,48,56
[root_at_hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore numa 8)
0,16,24,32,8,9,10,11
[root_at_hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore socket 8)
0,16,24,32,8,9,10,11
[root_at_hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore core 8)
0,16,24,32,8,9,10,11
[root_at_hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore pu 8)
0,16,24,32,8,9,10,11
================================================

Please note that Socket#1 is never chosen. Could you please help me with this?

Thanks a lot!
Jirka