Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] [hwloc-announce] Hardware locality (hwloc) v1.1rc1 released
From: Jirka Hladky (jhladky_at_[hidden])
Date: 2010-11-11 07:08:00


On Thursday, November 11, 2010 11:11:31 am Brice Goglin wrote:
> Le 11/11/2010 02:31, Samuel Thibault a écrit :
> >> get_mempolicy: Invalid argument
> >> hwloc_get_membind failed (errno 22 Invalid argument)
> >
> > Could you try to increase the value of max_os_index?
> >
> > I can see in the kernel source code the following in sys_get_mempolicy:
> > if (nmask != NULL && maxnode < MAX_NUMNODES)
> >         return -EINVAL;
> >
> > and MAX_NUMNODES depends on .config ...
>
> And indeed MAX_NUMNODES is (1<<CONFIG_NODES_SHIFT) and
> CONFIG_NODES_SHIFT=9 on rhel6 kernels. We pass a single ulong to the
> kernel, so it's not large enough to store 1<<9 bits. We couldn't
> reproduce on Debian and RHEL5 since CONFIG_NODES_SHIFT=6 there.
>
> We had to loop until we found the kernel NR_CPUS for sched_getaffinity,
> we can do the same to find the kernel MAX_NUMNODES for get_mempolicy.
> The attached patch may help. Only slightly tested obviously since I
> don't have any kernel causing the problem.
>
> Brice

Hi Brice,

thanks for the quick patch. I have tested it and it works! :-)

$ utils/hwloc-bind --membind node:1 --mempolicy interleave -- utils/hwloc-bind --get --membind
0x0000aaaa (interleave)
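
For anyone following the thread, here is a minimal sketch of the sizing problem and of the "grow until the kernel accepts it" idea. It is not the attached patch. It only simulates the kernel-side check quoted above, so fake_get_mempolicy(), ulongs_for_bits() and probe_maxnode() are hypothetical names made up for illustration, and the real get_mempolicy call is not used.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the kernel check quoted in this thread:
 * a non-NULL mask holding fewer than max_numnodes bits is rejected. */
int fake_get_mempolicy(const unsigned long *nmask, unsigned long maxnode,
                       unsigned long max_numnodes)
{
    if (nmask != NULL && maxnode < max_numnodes)
        return -EINVAL;
    return 0;
}

/* Number of unsigned longs needed to hold nbits bits (rounded up). */
size_t ulongs_for_bits(unsigned long nbits)
{
    return (nbits + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
}

/* Start with a single unsigned long and double the mask size until the
 * simulated kernel stops returning -EINVAL.
 * Returns the accepted size in bits, or 0 on any other failure. */
unsigned long probe_maxnode(unsigned long max_numnodes)
{
    unsigned long maxnode = BITS_PER_ULONG;

    assert(max_numnodes > 0);
    for (;;) {
        unsigned long *mask = calloc(ulongs_for_bits(maxnode), sizeof(*mask));
        int err;

        if (mask == NULL)
            return 0;
        err = fake_get_mempolicy(mask, maxnode, max_numnodes);
        free(mask);

        if (err == 0)
            return maxnode;          /* mask is large enough */
        if (err != -EINVAL)
            return 0;                /* unrelated error, give up */
        if (maxnode > ULONG_MAX / 2)
            return 0;                /* doubling would overflow */
        maxnode *= 2;
    }
}
```

With CONFIG_NODES_SHIFT=9, a call such as probe_maxnode(1UL << 9) has to grow well past the single unsigned long that was passed before, whereas with a shift of 6 a 64-bit unsigned long is already large enough.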

I have a couple of questions:
1) Does the --get option work together with --pid, for example to find out the
mempolicy of an arbitrary pid? I don't think get_mempolicy supports this.
Perhaps we could enhance the option parsing to raise an error when both --pid
and --get are specified.

2) This might be a dumb question. I have tried --get on my laptop, which is
running Fedora 12. It's a single-socket system with NUMA enabled, although
there is only node#0. I know that's a nonsensical setup, but it can still be
used to run some tests.

I'm quite puzzled by the following output:

$ utils/hwloc-bind --membind node:0 --mempolicy interleave -- utils/hwloc-bind --get --membind
0xf...f (interleave)

What does "0xf...f" mean?

3) Just a small hint. Fedora 12 is using almost the same kernel as RHEL-6.

Thanks for looking into this!!!

Cheers
Jirka