Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] bug in hwloc 1.1 hwloc_get_membind_nodeset on Linux
From: Bernd Kallies (kallies_at_[hidden])
Date: 2011-01-18 13:38:29


On Tue, 2011-01-18 at 19:11 +0100, Brice Goglin wrote:
> On 18/01/2011 17:40, Bernd Kallies wrote:
> > Hello,
> >
> > I'm using hwloc-1.1 on Linux 2.6.32.19 (x86_64) on a machine that has
> > several NUMA nodes. It seems to me that there are unwanted bits left in
> > the nodeset "set", when calling hwloc_get_membind_nodeset(topo,
> > set, ...) after a successful hwloc_set_membind() or
> > hwloc_set_membind_nodeset().
> >
> > E.g. after binding to node 2, calling
> > hwloc_get_membind_nodeset(topo, set, ...) and stringifying set gives
> > something like 2,516,518. The machine has 64 nodes and 1024 PUs.
> >
> > It seems that the following resolves the problem:
> >
> > --- src/topology-linux.c.bak 2010-11-25 15:01:48.000000000 +0100
> > +++ src/topology-linux.c 2011-01-18 17:38:18.000000000 +0100
> > @@ -886,18 +886,13 @@
> >  static void
> >  hwloc_linux_membind_mask_to_nodeset(hwloc_topology_t topology __hwloc_attribute_unused,
> >                                      hwloc_nodeset_t nodeset,
> > -                                    unsigned _max_os_index, const unsigned long *linuxmask)
> > +                                    unsigned max_os_index, const unsigned long *linuxmask)
> >  {
> > -  unsigned max_os_index;
> >    unsigned i;
> >
> > -  /* round up to the nearest multiple of BITS_PER_LONG */
> > -  max_os_index = (_max_os_index + HWLOC_BITS_PER_LONG) & ~(HWLOC_BITS_PER_LONG - 1);
> > -
> >    hwloc_bitmap_zero(nodeset);
> >    for(i=0; i<max_os_index/HWLOC_BITS_PER_LONG; i++)
> >      hwloc_bitmap_set_ith_ulong(nodeset, i, linuxmask[i]);
> > -  /* if we don't trust the kernel, we could clear bits from _max_os_index+1 to max_os_index-1 */
> >  }
> >  #endif /* HWLOC_HAVE_SET_MEMPOLICY || HWLOC_HAVE_MBIND */
> >
>
> Hello Bernd,
>
> I would like to understand better what's going on here.
> What's CONFIG_NODES_SHIFT in your kernel config?

/proc/config.gz says CONFIG_NODES_SHIFT=9

> Can you print max_os_index above (without round-up)?

max_os_index = 512, HWLOC_BITS_PER_LONG = 64, rounding gives
max_os_index = 576.

I also saw the same behaviour on a much smaller machine (a typical
2-socket Nehalem-EP), where CONFIG_NODES_SHIFT does not appear in
/proc/config.gz. There, max_os_index = 64, HWLOC_BITS_PER_LONG = 64, and
rounding gives max_os_index = 128.
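
The numbers above can be reproduced outside hwloc with a few lines of C.
This is only a sketch: BITS_PER_LONG is a local stand-in for
HWLOC_BITS_PER_LONG, fixed at 64 to match both machines, and the helper
names round_up_old, round_up_exact and check_reported_values are made up
for illustration. round_up_exact is the conventional round-up idiom, shown
for comparison; it is not part of the patch.

```c
#include <assert.h>

/* Local stand-in for HWLOC_BITS_PER_LONG (assumed to be 64, as reported). */
#define BITS_PER_LONG 64u

/* Expression removed by the patch: a full word is added before masking,
 * so a value that is already a multiple of 64 moves to the next multiple. */
static unsigned round_up_old(unsigned n)
{
    return (n + BITS_PER_LONG) & ~(BITS_PER_LONG - 1);
}

/* Conventional round-up, for comparison only (hypothetical alternative):
 * aligned values are left unchanged. */
static unsigned round_up_exact(unsigned n)
{
    return (n + BITS_PER_LONG - 1) & ~(BITS_PER_LONG - 1);
}

/* Number of unsigned longs the copy loop would read for a given bound. */
static unsigned words_read(unsigned max_os_index)
{
    return max_os_index / BITS_PER_LONG;
}

/* Checks the values reported for the two machines. */
static void check_reported_values(void)
{
    assert(round_up_old(512) == 576);   /* 64-node machine */
    assert(round_up_old(64) == 128);    /* 2-socket Nehalem-EP */
    assert(round_up_exact(512) == 512);
    assert(round_up_exact(64) == 64);
    /* One extra word is read on the large machine: 9 instead of 8. */
    assert(words_read(round_up_old(512)) == 9);
    assert(words_read(512) == 8);
}
```

Calling check_reported_values() from a main() passes silently. With the
old expression the loop reads word index 8 of linuxmask, which covers bits
512 to 575, and that is where the stray bits 516 and 518 sit.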

Hope this helps. BK

> Brice
>

-- 
Dr. Bernd Kallies
Konrad-Zuse-Zentrum für Informationstechnik Berlin
Takustr. 7
14195 Berlin
Tel: +49-30-84185-270
Fax: +49-30-84185-311
e-mail: kallies_at_[hidden]