Hardware Locality Users' Mailing List Archives

Subject: [hwloc-users] hwloc-bind --get on Solaris for binding to a single core
From: Eugene Loh (eugene.loh_at_[hidden])
Date: 2013-02-06 01:49:05


This is on a Solaris 11 system with hwloc 1.6.1:

% lstopo-no-graphics
Machine (4095MB) + NUMANode L#0 (P#0 4095MB) + Socket L#0
   Core L#0 + PU L#0 (P#0)
   Core L#1 + PU L#1 (P#1)
   Core L#2 + PU L#2 (P#2)
   Core L#3 + PU L#3 (P#3)
% hwloc-bind socket:0.pu:1 hwloc-bind --get
0x0000000f

I assume that output is wrong. I bind to a single PU (pu:1), so I would expect a mask of 0x00000002, but the returned mask covers all four PUs.

To confirm that binding is indeed happening and that it's the reporting that's incorrect:

% hwloc-bind socket:0.pu:0 pbind -q
process id 1773: 0
% hwloc-bind socket:0.pu:1 pbind -q
process id 1774: 1
% hwloc-bind socket:0.pu:2 pbind -q
process id 1775: 2
% hwloc-bind socket:0.pu:3 pbind -q
process id 1776: 3

It seems to me the problem is in topology-solaris.c. In hwloc_solaris_set_sth_cpubind(), we can bind to a single core with
processor_bind(), which is what's happening in our case. Then, in hwloc_solaris_get_sth_cpubind(), we check for lgroup affinity but
not for any processor_bind() binding. So, we assume we're not bound and apparently report the whole machine.

How about adding a check upon entry to hwloc_solaris_get_sth_cpubind(): if processor_bind() shows binding, report this and be done.
If not, then continue on with the lgroup logic that's already in that function. Yes?
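
To make the proposal concrete, here is a rough sketch of the check I have in mind. It is not a patch against topology-solaris.c: query_pbind() and reported_mask() are hypothetical names, and I use a plain unsigned long where hwloc would use its own bitmap type. The only real Solaris call is processor_bind() with PBIND_QUERY; the non-Solaris branch is a stand-in so the sketch compiles elsewhere.

```c
#include <limits.h>

#ifdef __sun
#include <sys/types.h>
#include <sys/processor.h>
#include <sys/procset.h>
#else
/* Stand-ins so the sketch builds off Solaris; not the real headers. */
typedef int processorid_t;
#define PBIND_NONE (-1)
#endif

/* Hypothetical helper: ask the OS whether the calling process is bound
 * to one processor. Returns that processor id, or PBIND_NONE. */
processorid_t query_pbind(void)
{
#ifdef __sun
    processorid_t obind;
    /* PBIND_QUERY leaves the binding unchanged and stores it in obind. */
    if (processor_bind(P_PID, P_MYID, PBIND_QUERY, &obind) != 0)
        return PBIND_NONE;
    return obind;
#else
    return PBIND_NONE; /* no processor_bind() here; behave as unbound */
#endif
}

/* Hypothetical helper: choose the mask to report. A processor_bind()
 * binding wins; otherwise fall through to the lgroup-derived mask. */
unsigned long reported_mask(processorid_t pbind, unsigned long lgroup_mask)
{
    const int bits = (int)(sizeof(unsigned long) * CHAR_BIT);

    if (pbind == PBIND_NONE || pbind < 0 || pbind >= bits)
        return lgroup_mask;      /* not bound: keep the existing logic */
    return 1UL << pbind;         /* bound: report exactly that one PU */
}
```

With the machine above, reported_mask(1, 0xf) gives 0x2 for the pu:1 case, while an unbound process still gets whatever the lgroup logic computes.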