Hardware Locality Development Mailing List Archives

Subject: [hwloc-devel] hwloc-distrib: how to start at lower hierarchy level?
From: Jirka Hladky (jhladky_at_[hidden])
Date: 2010-07-04 20:41:07


Hi all,

I use hwloc-distrib quite often to distribute jobs optimally on NUMA
boxes. I use it to test the Linux kernel task scheduler by comparing the
runtime of jobs bound to the best possible CPU configuration (keeping the CPU
cache in mind) with runs that have no CPU affinity set.

I just ran into a strange issue on a box with Intel's newest Nehalem CPUs.
There are 4 Sockets, each with 8 physical cores and hyper-threading enabled,
which gives 64 OS processors.

The box has a strange NUMA layout; I will need to check why that is.
Basically, there are 3 NUMA nodes: one includes 2 Sockets, and the other 2
each have one Socket associated with them.

hwloc-distrib --single 8 will distribute jobs in the following way:
3 jobs on NUMANode #0
3 jobs on NUMANode #1
2 jobs on NUMANode #2

lstopo 64.pdf
for A in $(hwloc-distrib --single 8); do taskset ${A} sleep 100 & done
lstopo --top top.pdf

hwloc-distrib is in fact doing the right thing, but this is not what I want.
It's not the best configuration when you consider the CPU cache!

I have figured out the following way to tell hwloc-distrib to avoid using
NUMANodes when computing the CPU distribution:

lstopo --ignore NUMANode No_NUMA.xml
for A in $(hwloc-distrib --xml No_NUMA.xml --single 8); do taskset ${A} sleep 100 & done
lstopo --top fix.pdf
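
The workaround above could be wrapped in a small shell function so that the
intermediate XML file does not have to be managed by hand. This is only a
sketch: the function name run_without_numa and the LSTOPO, HWLOC_DISTRIB and
TASKSET override variables are hypothetical conveniences, not part of hwloc.
It uses only the lstopo, hwloc-distrib and taskset invocations shown above,
and, unlike the one-liner, it waits for the jobs to finish.

```shell
# Sketch only: run N copies of a command, one per CPU mask, with the masks
# computed from a topology that has NUMA nodes ignored.
# Hypothetical names: run_without_numa, LSTOPO, HWLOC_DISTRIB, TASKSET.
run_without_numa() {
    njobs=$1
    shift

    # Export the topology without NUMA nodes to a temporary XML file.
    # The .xml suffix is kept, as in the manual workaround.
    tmpdir=$(mktemp -d) || return 1
    xml="$tmpdir/No_NUMA.xml"
    "${LSTOPO:-lstopo}" --ignore NUMANode "$xml" || { rm -rf "$tmpdir"; return 1; }

    # hwloc-distrib prints one CPU mask per job; start each job bound to its mask.
    for mask in $("${HWLOC_DISTRIB:-hwloc-distrib}" --xml "$xml" --single "$njobs"); do
        "${TASKSET:-taskset}" "$mask" "$@" &
    done

    # The XML file was only needed while the masks were computed.
    rm -rf "$tmpdir"

    # Wait for all background jobs started above.
    wait
}
```

With this in place, "run_without_numa 8 sleep 100" would correspond to the
two commands of the manual workaround.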

I'm wondering if there is a better way to make "Socket" the top object.
Something like:
hwloc-distrib --ignore NUMANode --single 8
or
hwloc-distrib --top_level Socket --single 8

would be very useful. Is there something like this already? If not, would you
consider this as an enhancement?

Thanks!
Jirka



  • application/pdf attachment: fix.pdf

  • application/pdf attachment: top.pdf

  • application/pdf attachment: 64.pdf