Hardware Locality Users' Mailing List Archives

Subject: [hwloc-users] Questions to lstopo and hwloc-bind
From: Siegmar Gross (Siegmar.Gross_at_[hidden])
Date: 2012-09-14 01:48:03


Hi,

I have installed hwloc-1.5 on our systems and get the following output
when I run "lstopo" on a Sun Server M4000 (two quad-core processors with
two hardware threads per core).

rs0 fd1026 101 lstopo
Machine (32GB) + NUMANode L#0 (P#1 32GB)
  Socket L#0
    Core L#0
      PU L#0 (P#0)
      PU L#1 (P#1)
    Core L#1
      PU L#2 (P#2)
      PU L#3 (P#3)
    Core L#2
      PU L#4 (P#4)
      PU L#5 (P#5)
    Core L#3
      PU L#6 (P#6)
      PU L#7 (P#7)
  Socket L#1
    Core L#4
      PU L#8 (P#8)
      PU L#9 (P#9)
    Core L#5
      PU L#10 (P#10)
      PU L#11 (P#11)
    Core L#6
      PU L#12 (P#12)
      PU L#13 (P#13)
    Core L#7
      PU L#14 (P#14)
      PU L#15 (P#15)

When I run the command on a Sun Ultra 45 with two single-core processors,
I get the following output.

tyr fd1026 116 lstopo
Machine (4096MB)
  NUMANode L#0 (P#2 2048MB) + Socket L#0 + Core L#0 + PU L#0 (P#0)
  NUMANode L#1 (P#1 2048MB) + Socket L#1 + Core L#1 + PU L#1 (P#1)

First question: Why does "lstopo" report two NUMA nodes on the Sun Ultra
but only one NUMA node on the M4000, although both machines are equipped
with two processors and both are running Solaris 10?

rs0 fd1026 101 uname -a
SunOS rs0.informatik.hs-fulda.de 5.10 Generic_147440-21 sun4u
sparc SUNW,SPARC-Enterprise Solaris

tyr fd1026 117 uname -a
SunOS tyr.informatik.hs-fulda.de 5.10 Generic_147440-23 sun4u
  sparc SUNW,A70 Solaris

I get the following error when I try to bind a process to a core
on the M4000 machine.

rs0 fd1026 104 hwloc-bind socket:0.core:0 -l date
hwloc_set_cpubind 0x00000003 failed (errno 18 Cross-device link)
Fri Sep 14 07:37:14 CEST 2012
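
For reference, the mask 0x00000003 in the error message matches the two
PUs that "lstopo" lists under Core L#0 (P#0 and P#1): one bit is set per
physical PU index. A minimal Python sketch of that arithmetic follows; the
helper name cpuset_mask is hypothetical and not part of hwloc.

```python
def cpuset_mask(pu_indexes):
    """Return an integer bitmask with one bit set per physical PU index."""
    mask = 0
    for pu in pu_indexes:
        mask |= 1 << pu
    return mask


# Core L#0 on the M4000 contains PU P#0 and PU P#1 (see the lstopo output).
core0 = cpuset_mask([0, 1])
print(f"0x{core0:08x}")  # 0x00000003, the mask shown in the error message

# A single PU, as used by "hwloc-bind pu:0", sets only one bit.
pu0 = cpuset_mask([0])
print(f"0x{pu0:08x}")  # 0x00000001
```

So the failing call asks for a binding to two PUs at once, while the
working "pu:0" call asks for exactly one.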

The following command, however, works for all 16 hardware threads.

rs0 fd1026 105 hwloc-bind pu:0 -l date
Fri Sep 14 07:38:37 CEST 2012

Both commands work without problems on the Sun Ultra.

tyr fd1026 121 hwloc-bind socket:0.core:0 -l date
Fri Sep 14 07:40:22 CEST 2012
tyr fd1026 122 hwloc-bind socket:1.core:0 -l date
Fri Sep 14 07:40:26 CEST 2012
tyr fd1026 123 hwloc-bind pu:0 -l date
Fri Sep 14 07:40:37 CEST 2012
tyr fd1026 124 hwloc-bind pu:1 -l date
Fri Sep 14 07:40:41 CEST 2012

Second question: How can I determine from the "lstopo" output which
bindings are allowed? I have no idea why I get "errno 18 Cross-device
link" on the M4000.
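
Until that is clear, one empirical approach is simply to try every
candidate location and see which ones fail. The Python sketch below does
this under several assumptions: the location strings are built from the
M4000 topology shown above (2 sockets, 4 cores per socket, 16 PUs), the
command line mirrors the "hwloc-bind <location> -l date" calls used in
this message, and a failed binding is detected by the word "failed" on
stderr, as in the error output above. The function names are hypothetical.

```python
import subprocess


def candidate_locations(sockets=2, cores_per_socket=4, pus=16):
    """Build location strings for the topology reported by lstopo."""
    locations = [f"socket:{s}" for s in range(sockets)]
    locations += [
        f"socket:{s}.core:{c}"
        for s in range(sockets)
        for c in range(cores_per_socket)
    ]
    locations += [f"pu:{p}" for p in range(pus)]
    return locations


def binding_works(location, run=subprocess.run):
    """Run a trivial command under hwloc-bind and check stderr for a failure.

    Assumption: a failed binding prints a line containing "failed" on
    stderr, as in the hwloc_set_cpubind message quoted in this post.
    """
    result = run(
        ["hwloc-bind", location, "-l", "date"],
        capture_output=True,
        text=True,
    )
    return "failed" not in result.stderr


def probe_all():
    """Print one line per candidate location (requires hwloc-bind in PATH)."""
    for location in candidate_locations():
        status = "ok" if binding_works(location) else "FAILED"
        print(f"{location:20s} {status}")
```

Calling probe_all() on each machine would give a table of which locations
can be bound there; it does not explain why a binding is rejected.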

Thank you very much in advance for any answers and suggestions.

Kind regards

Siegmar