Dear Jeff,
Perhaps you simply have run out of memory on that NUMA node, and therefore the malloc failed. Check "numactl --hardware", for example.
You might want to check the output of numastat to see if one or more of your NUMA nodes have run out of memory.
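
If numactl or numastat are not at hand, the same per-node figures can be read from sysfs. Below is a minimal sketch, assuming a Linux system that exposes /sys/devices/system/node/nodeN/meminfo and assuming node numbers are contiguous (which is not guaranteed). The function names are made up for illustration and are not part of hwloc or libnuma.

```c
#include <stdio.h>

/* Parse one line of /sys/devices/system/node/nodeN/meminfo, for example
 * "Node 0 MemFree:        12345 kB".
 * Returns 1 if the line is a MemFree line, 0 otherwise. */
int parse_node_memfree_kb(const char *line, int *node, long *free_kb)
{
    return sscanf(line, "Node %d MemFree: %ld kB", node, free_kb) == 2;
}

/* Print the free memory of node0, node1, ... and stop at the first node
 * whose meminfo file cannot be opened (assumption: contiguous numbering).
 * Returns the number of nodes that were read. */
int print_numa_free_memory(FILE *out)
{
    int n;

    for (n = 0; ; n++) {
        char path[64], line[256];
        int node;
        long kb;
        FILE *f;

        snprintf(path, sizeof path,
                 "/sys/devices/system/node/node%d/meminfo", n);
        f = fopen(path, "r");
        if (!f)
            break;
        while (fgets(line, sizeof line, f))
            if (parse_node_memfree_kb(line, &node, &kb))
                fprintf(out, "node %d: %ld kB free\n", node, kb);
        fclose(f);
    }
    return n;
}
```

Calling print_numa_free_memory(stderr) just before the failing malloc should show whether the node you bind to is the one running low.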
> <main_hybrid_bind_mem.c>_______________________________________________
On Sep 5, 2012, at 12:58 PM, Gabriele Fatigati wrote:
> I've reproduced the problem in a small MPI + OpenMP code.
>
> The error is the same: after a number of memory binds, it gives "Cannot allocate memory".
>
> Thanks.
>
> 2012/9/5 Gabriele Fatigati <g.fatigati@cineca.it>
> With a smaller matrix, binding works well, but the available memory is enough for the bigger matrix too, so I'm a bit confused.
>
> With the same big matrix size and no binding, the code works well, so how can I explain this behaviour?
>
> Maybe hwloc_set_area_membind_nodeset makes extra allocations that persist after the call?
>
>
>
> 2012/9/5 Brice Goglin <Brice.Goglin@inria.fr>
> An internal malloc failed then. That would explain why your malloc failed too.
> It looks like you malloc'ed too much memory in your program?
>
> Brice
>
>
>
>
> On 05/09/2012 15:56, Gabriele Fatigati wrote:
>> An update:
>>
>> printing strerror(errno) right after hwloc_set_area_membind_nodeset gives: "Cannot allocate memory"
>>
>> 2012/9/5 Gabriele Fatigati <g.fatigati@cineca.it>
>> Hi,
>>
>> I've noticed that hwloc_set_area_membind_nodeset returns -1 but errno is not equal to EXDEV or ENOSYS. I assumed that these two cases were the only possibilities.
>>
>> From the hwloc documentation:
>>
>> -1 with errno set to ENOSYS if the action is not supported
>> -1 with errno set to EXDEV if the binding cannot be enforced
>>
>>
>> Is there any other reason the binding could fail? The available memory is enough.
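
The two documented values are not the only ones that can show up: as seen elsewhere in this thread, errno can also be ENOMEM ("Cannot allocate memory"). A minimal sketch of how the result could be reported follows. Both helpers are hypothetical (they are not hwloc functions), and the hint text for ENOMEM is my reading of this thread rather than something taken from the hwloc documentation.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn the errno left by a failed binding call
 * into a short hint. ENOSYS and EXDEV follow the hwloc documentation
 * quoted above; ENOMEM is the value observed in this thread. */
const char *membind_errno_hint(int err)
{
    switch (err) {
    case ENOSYS: return "action not supported";
    case EXDEV:  return "binding cannot be enforced";
    case ENOMEM: return "out of memory (kernel or internal allocation)";
    default:     return "unexpected error";
    }
}

/* Hypothetical wrapper: rc is the return code of the binding call and
 * saved_errno is errno captured immediately after it.
 * Returns 0 on success, -1 after printing a diagnostic on failure. */
int report_membind_failure(int rc, int saved_errno, FILE *out)
{
    assert(out != NULL);
    if (rc == 0)
        return 0;
    fprintf(out, "membind failed: %s (%s)\n",
            strerror(saved_errno), membind_errno_hint(saved_errno));
    return -1;
}
```

It would be called as report_membind_failure(rc, errno, stderr) directly after hwloc_set_area_membind_nodeset, before any other library call has a chance to overwrite errno.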
>>
>> 2012/9/5 Brice Goglin <Brice.Goglin@inria.fr>
>> Hello Gabriele,
>>
>> The only limit that I can think of is the available physical memory on each NUMA node (numactl -H will tell you how much of each NUMA node's memory is still available).
>> malloc usually only fails (it returns NULL?) when there is no *virtual* memory left, which is a different thing. Unless you allocate terabytes of virtual memory, this shouldn't happen easily.
>>
>> Brice
>>
>>
>>
>>
>> On 05/09/2012 14:27, Gabriele Fatigati wrote:
>>> Dear Hwloc users and developers,
>>>
>>>
>>> I'm using hwloc 1.4.1 in a multithreaded program on a Linux platform, where each thread binds many non-contiguous pieces of a big matrix, calling the hwloc_set_area_membind_nodeset function very intensively:
>>>
>>> hwloc_set_area_membind_nodeset(topology, punt+offset, len, nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD | HWLOC_MEMBIND_MIGRATE);
>>>
>>> Binding seems to work well, since the function returns 0 for every call.
>>>
>>> The problem is that after binding, a simple small malloc fails without any apparent reason.
>>>
>>> With memory binding disabled, the allocations work well. Is there any known problem if hwloc_set_area_membind_nodeset is used intensively?
>>>
>>> Is there some operating system limit on binding memory pages?
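
One Linux limit that may be worth ruling out is the per-process cap on memory mappings, vm.max_map_count. Binding a sub-range of an existing mapping with a different policy can split that mapping, so many small non-contiguous binds can increase the mapping count, and the kernel reports ENOMEM once the cap is reached. Whether that is what happens here is an assumption, not something established in this thread. A minimal Linux-only sketch to compare the two numbers follows; the function names are made up for illustration.

```c
#include <stdio.h>

/* Count the newline-terminated lines of a text file.
 * Returns -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    FILE *f = fopen(path, "r");
    long n = 0;
    int c;

    if (!f)
        return -1;
    while ((c = fgetc(f)) != EOF)
        if (c == '\n')
            n++;
    fclose(f);
    return n;
}

/* Read a single integer from a file. Returns -1 on any failure. */
long read_long(const char *path)
{
    FILE *f = fopen(path, "r");
    long v = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &v) != 1)
        v = -1;
    fclose(f);
    return v;
}

/* Print the number of mappings of the calling process next to the
 * system limit. Each line of /proc/self/maps is one mapping. */
void report_map_usage(FILE *out)
{
    long used = count_lines("/proc/self/maps");
    long limit = read_long("/proc/sys/vm/max_map_count");

    fprintf(out, "mappings in use: %ld, vm.max_map_count: %ld\n",
            used, limit);
}
```

Calling report_map_usage(stderr) before and after the binding loop would show whether the mapping count grows towards the limit as the binds accumulate.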
>>>
>>> Thanks in advance.
>>>
>>> --
>>> Ing. Gabriele Fatigati
>>>
>>> HPC specialist
>>>
>>> SuperComputing Applications and Innovation Department
>>>
>>> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>>
>>> www.cineca.it Tel: +39 051 6171722
>>>
>>> g.fatigati [AT] cineca.it
>>>
>>>
>>> _______________________________________________
>>> hwloc-users mailing list
>>>
>>> hwloc-users@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>>
>
> hwloc-users mailing list
> hwloc-users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
--
Jeff Squyres
jsquyres@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/