Open MPI logo

Portable Hardware Locality (hwloc)

  |   Home   |   Support   |   FAQ   |  

The Best of lstopo

Here are lstopo graphical outputs on miscellaneous platforms, from common servers to huge many-core platforms and strange machines. If your output is even nicer, please send it together with the XML!

Click on images to open them.

Common Servers

2x Xeon Clovertown E5345 (from 2007, with hwloc v1.11):

(link to here)


2x Xeon Haswell-EP E5-2683v3 (from 2014, with hwloc v1.11).
Processors are configured in Cluster-on-Die mode which shows 2 NUMA nodes per package:

(link to here)


4x Opteron Magny-Cours 6272 (from 2012, with hwloc v1.11):

(link to here)


8x Opteron Istanbul 8439SE (from 2010, with hwloc v1.11):

(link to here)


Servers with I/O devices

2x Xeon Sandy-Bridge-EP E5-2680 + 1x Xeon Phi + 1x InfiniBand HCA (from 2013, with hwloc v1.11):

(link to here)


2x Xeon Sandy-Bridge-EP E5-2690 + 2x Cisco usNIC (from 2013, with hwloc v1.11).
Many identical unused usNIC PCI virtual functions are collapsed in the graphical output:

(link to here)


2x Xeon Sandy-Bridge-EP E5-2650 + 2x AMD GPUs (from 2013, with hwloc v1.11).
GPUs are reported as OpenCL devices:

(link to here)


2x Xeon Ivy-Bridge-EP E5-2680v2 + 2x NVIDIA GPUs (from 2013, with hwloc v1.11).
GPUs are reported as CUDA devices and X11 display :1.0:

(link to here)


2x Xeon Westmere-EP X5650 + 8x NVIDIA GPUs appearing as CUDA devices (Tyan S7015 from 2010, with hwloc v1.11):

(link to here)


4x Xeon Westmere-EX E7-4870 (Dell PowerEdge R910 from 2012, with hwloc v1.11).
A Group of two packages is used to better report the I/O-affinity of the PCI bus:

(link to here)


2x Power8 with 2 NUMA node each + 1x InfiniBand HCA (IBM Power S822L from 2015, with hwloc v1.11).
20 cores and 160 hardware threads total:

(link to here)


Big platforms (make sure you have a giant screen)

Intel Knights Landing.
MCDRAM is configured in Hybrid mode (half as a memory-side cache, half as an additional high-bandwidth NUMA node).
SubNUMA clustering (SNC-4) is also enabled, for a total of 8 NUMA nodes.
72 cores and 288 threads:

(link to here)


Dual-socket Oracle SPARC T7 (from 2016, with hwloc v1.11).
32 cores in each socket, with 8 hardware threads each:

(link to here)


Oracle SPARC T8 (from 2017, with hwloc 1.11).
32 cores with 8 hardware threads each:

(link to here)


16x Xeon Dunnington X7460 with IBM custom NUMA interconnect (IBM x3950 M2 from 2009, with hwloc v1.11).
96 cores total:

(link to here)


20x Xeon Westmere-EX E7-8837 (SGI Altix UV 100 from 2012, with hwloc v1.11).
160 cores total.
The matrix of distances between NUMA nodes is used to improve the topology by creating Groups of two processors, which actually correspond to physical Blades:

(link to here)


64x Itanium-2 Montvale 9140N (HP Superdome SX2000 from 2009, with hwloc v1.11).
128 cores total, and one additional CPU-less NUMA node.
This old Linux kernel does not report cache information:

(link to here)


128x Itanium-2 Montvale 9150M (SGI Altix 4700 from 2008, with hwloc v1.11).
256 cores total.
Two levels of Groups are created based on the NUMA distance matrix. They corresponds to SGI racks of 4 IRUs containing 4 Blades each.
This old Linux kernel does not report cache information:

(link to here)


48x Xeon Nehalem-EX X7560 with distance-based groups (SGI Altix UV1000 from 2010, with hwloc v1.11).
384 cores and 768 hyperthreads total:

(link to here)


56x Xeon Sandy-Bridge-EP E5-4640 + 2x NVIDIA GPUs + 2x Xeon Phi (SGI Altix UV2000 from 2014, with hwloc v1.11).
448 cores and 896 hyperthreads total.
The machine is made of 26 blades of 2 processors (appearing as Group0 white boxes),2 blades with 1 processor and 1 NVIDIA GPU (blue boxes),and 2 blades with 1 processor and 1 Xeon Phi (pink boxes).
Red hyperthreads are unavailable to the current user due to Linux cgroups.
Blue and pink blades were colored using the lstopoStyle custom attribute:

(link to here)


16x Power7 (IBM Power 780 from 2011, with hwloc v1.11).
64 cores and 256 hardware threads total.
The buggy hardware Device-Tree reports 8 single-core packages instead of each octo-core package:

(link to here)


Unusual platforms (you won't often run hwloc there)

BlueGene/Q node (from 2013, with hwloc v1.11):

(link to here)


Xeon Phi Knights Corner 7120 (from 2014, with hwloc v1.11):

(link to here)


PlayStation 3 (from 2008, with hwloc v1.11).
The old Linux kernel does not report cache information:

(link to here)


ARMv7 (from 2015, with hwloc v1.11):

(link to here)


IBM/S390 (from 2010, with hwloc v1.11)
Packages are grouped within Books (name of IBM mainframe modular cards):

(link to here)


Miscellaneous

ASCII Art output in case you don't have X11 available.
Generated with lstopo -.ascii with hwloc v1.11, or with lstopo -.txt in older releases:

(link to here)