
Hardware Locality Users' Mailing List Archives


Subject: Re: [hwloc-users] Thread binding problem
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2012-09-06 12:54:27


On 06/09/2012 14:51, Gabriele Fatigati wrote:
> Hi Brice,
>
> the initial grep is:
>
> numa_policy 65671 65952 24 144 1 : tunables 120 60 8 : slabdata 458 458 0
>
> When set_membind fails is:
>
> numa_policy 482 1152 24 144 1 : tunables 120 60 8 : slabdata 8 8 288
>
> What does it mean?

The first number is the number of active objects, so about 65000
mempolicy objects were in use in the first output.
(I wonder if you swapped the two outputs; I expected the higher numbers
at the end of the run.)

Anyway, having 65000 mempolicies in use is a lot, and it roughly matches
the number of set_area_membind() calls that succeeded before one failed.
So the kernel might indeed be failing to merge them.

That said, these objects are small (24 bytes each, if I am reading the
output correctly), so 65671 of them only amount to about 1.6MB
(65671 x 24 bytes). There's still something else eating all the memory;
/proc/meminfo (MemFree) and numactl -H should again help.
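If it helps, here's a small untested helper you could call right after
the failing hwloc_set_area_membind() to log MemFree from inside the
program, instead of catching it interactively:

#include <stdio.h>
#include <string.h>

/* Print the MemFree: line from /proc/meminfo, e.g. right after
   hwloc_set_area_membind() returns an error. */
static void print_memfree(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];
    if (!f)
        return;
    while (fgets(line, sizeof(line), f))
        if (strncmp(line, "MemFree:", 8) == 0)
            fputs(line, stdout);
    fclose(f);
}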

Brice

>
>
>
> 2012/9/6 Brice Goglin <Brice.Goglin_at_[hidden]>
>
> On 06/09/2012 12:19, Gabriele Fatigati wrote:
>> I didn't find any strange numbers in /proc/meminfo.
>>
>> I've noticed that the program fails after exactly 65479
>> hwloc_set_area_membind calls, so it sounds like some kernel
>> limit. You can check that with just one thread, too.
>>
>> Maybe nobody has noticed this before, because usually we bind a large
>> amount of contiguous memory a few times, instead of many small,
>> non-contiguous pieces of memory over and over.. :(
>
> If you have root access, try (as root)
> watch -n 1 grep numa_policy /proc/slabinfo
> Put a sleep(10) in your program when set_area_membind() fails, and
> don't let your program exit before you can read the content of
> /proc/slabinfo.
>
> Brice
>
>
>
>
>>
>> 2012/9/6 Brice Goglin <Brice.Goglin_at_[hidden]>
>>
>> On 06/09/2012 10:44, Samuel Thibault wrote:
>> > Gabriele Fatigati, on Thu 06 Sep 2012 10:12:38 +0200, wrote:
>> >> mbind in hwloc_linux_set_area_membind() fails:
>> >>
>> >> Error from HWLOC mbind: Cannot allocate memory
>> > Ok. mbind is not really supposed to allocate much memory, but it
>> > still does allocate some, to record the policy.
>> >
>> >> // hwloc_obj_t obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, tid);
>> >> hwloc_obj_t obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid);
>> >> hwloc_cpuset_t cpuset = hwloc_bitmap_dup(obj->cpuset);
>> >> hwloc_bitmap_singlify(cpuset);
>> >> hwloc_set_cpubind(topology, cpuset, HWLOC_CPUBIND_THREAD);
>> >>
>> >> for( i = chunk*tid; i < len; i+=PAGE_SIZE) {
>> >>     // res = hwloc_set_area_membind_nodeset(topology, &array[i], PAGE_SIZE, obj->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
>> >>     res = hwloc_set_area_membind(topology, &array[i], PAGE_SIZE, cpuset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
>> > and I'm afraid that calling set_area_membind for each page might be
>> > too dense: the kernel is probably allocating a memory policy record
>> > for each page, not being able to merge adjacent equal policies.
>> >
>>
>> It's supposed to merge VMAs with the same policy (from what I
>> understand of the code), but I don't know whether that actually works.
>> Maybe Gabriele found a kernel bug :)
>>
>> Brice
>>