Hardware Locality Users' Mailing List Archives

Subject: Re: [hwloc-users] Thread binding problem
From: Gabriele Fatigati (g.fatigati_at_[hidden])
Date: 2012-09-05 12:58:17


I've reproduced the problem in a small MPI + OpenMP code.

The error is the same: after a number of memory binds, the call fails with
"Cannot allocate memory".

Thanks.

2012/9/5 Gabriele Fatigati <g.fatigati_at_[hidden]>

> Scaling the matrix down, binding works well, but there is enough memory
> available for the bigger matrix too, so I'm a bit confused.
>
> With the same big matrix size and no binding the code works well, so how
> can I explain this behaviour?
>
> Maybe hwloc_set_area_membind_nodeset makes extra allocations
> that persist after the call?
>
>
>
> 2012/9/5 Brice Goglin <Brice.Goglin_at_[hidden]>
>
>> An internal malloc failed then. That would explain why your malloc
>> failed too.
>> It looks like you malloc'ed too much memory in your program?
>>
>> Brice
>>
>>
>>
>>
>> On 05/09/2012 15:56, Gabriele Fatigati wrote:
>>
>> An update:
>>
>> placing strerror(errno) after hwloc_set_area_membind_nodeset gives:
>> "Cannot allocate memory"
>>
>> 2012/9/5 Gabriele Fatigati <g.fatigati_at_[hidden]>
>>
>>> Hi,
>>>
>>> I've noticed that hwloc_set_area_membind_nodeset returns -1 but errno is
>>> neither EXDEV nor ENOSYS. I assumed these were the only two
>>> possible cases.
>>>
>>> From the hwloc documentation:
>>>
>>> -1 with errno set to ENOSYS if the action is not supported
>>> -1 with errno set to EXDEV if the binding cannot be enforced
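The two documented cases above are not the only values errno can hold after a failed call. A minimal sketch of how the result could be classified follows. The helper name membind_failure_kind is hypothetical and not part of hwloc, and the ENOMEM branch is an assumption based on the "Cannot allocate memory" message reported in this thread, which is what glibc's strerror prints for ENOMEM.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper, NOT part of hwloc: maps the return code and the
 * errno value saved right after a membind call to a short category.
 * ENOSYS and EXDEV are the two cases named in the hwloc documentation
 * quoted above; ENOMEM is an assumption drawn from this thread. */
const char *membind_failure_kind(int rc, int saved_errno)
{
    if (rc == 0)
        return "ok";
    switch (saved_errno) {
    case ENOSYS: return "not supported";
    case EXDEV:  return "cannot be enforced";
    case ENOMEM: return "out of memory";
    default:     return "other";
    }
}
```

To use it, copy errno into a local variable immediately after hwloc_set_area_membind_nodeset returns, before any other library call can overwrite it, then pass that copy together with the return code. The saved value can also be passed to strerror for a readable message.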
>>>
>>>
>>> Is there any other reason a binding can fail? There is enough memory available.
>>>
>>> 2012/9/5 Brice Goglin <Brice.Goglin_at_[hidden]>
>>>
>>>> Hello Gabriele,
>>>>
>>>> The only limit I can think of is the available physical memory
>>>> on each NUMA node (numactl -H will tell you how much of each NUMA node's
>>>> memory is still available).
>>>> malloc usually only fails (it returns NULL?) when there is no *virtual*
>>>> memory left, which is different. Unless you allocate terabytes
>>>> of virtual memory, this shouldn't happen easily.
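The per-node free memory mentioned above can also be read from inside the program. This is a rough sketch, assuming a Linux system that exposes /sys/devices/system/node/node<N>/meminfo with lines of the form "Node 0 MemFree:  1234567 kB". The function names parse_node_memfree and node_memfree_kb are made up for illustration.

```c
#include <stdio.h>

/* Parses one line of a per-node meminfo file, for example
 * "Node 0 MemFree:        1234567 kB".
 * Returns 1 and fills *node and *kb for a MemFree line, 0 otherwise. */
int parse_node_memfree(const char *line, int *node, unsigned long *kb)
{
    return sscanf(line, "Node %d MemFree: %lu kB", node, kb) == 2;
}

/* Returns the free memory of the given NUMA node in kB, or -1 if the
 * sysfs file cannot be opened or contains no MemFree line.
 * The sysfs path is Linux-specific. */
long node_memfree_kb(int node)
{
    char path[64], line[256];
    int n;
    unsigned long kb;
    long result = -1;
    FILE *f;

    snprintf(path, sizeof path,
             "/sys/devices/system/node/node%d/meminfo", node);
    f = fopen(path, "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        if (parse_node_memfree(line, &n, &kb)) {
            result = (long)kb;
            break;
        }
    }
    fclose(f);
    return result;
}
```

Logging node_memfree_kb for each node just before and after the binding loop would show whether any node is actually running short of physical memory when the failure appears.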
>>>>
>>>> Brice
>>>>
>>>>
>>>>
>>>>
>>>> On 05/09/2012 14:27, Gabriele Fatigati wrote:
>>>>
>>>> Dear Hwloc users and developers,
>>>>
>>>>
>>>> I'm using hwloc 1.4.1 in a multithreaded program on a Linux platform,
>>>> where each thread binds many non-contiguous pieces of a big matrix, making
>>>> very intensive use of the hwloc_set_area_membind_nodeset function:
>>>>
>>>> hwloc_set_area_membind_nodeset(topology, punt+offset, len, nodeset,
>>>> HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD | HWLOC_MEMBIND_MIGRATE);
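On Linux, memory binding acts on whole pages, so a piece that starts or ends in the middle of a page still affects the entire page. The sketch below shows which page-aligned span a given (address, length) pair covers. The helper page_span is hypothetical and only illustrates the arithmetic; it is not how hwloc is documented to work internally, and the page size is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rounds the area [addr, addr + len) outward to
 * page boundaries. pagesize must be a power of two (for example the
 * value returned by sysconf(_SC_PAGESIZE)). */
void page_span(uintptr_t addr, size_t len, size_t pagesize,
               uintptr_t *start, size_t *span)
{
    uintptr_t mask = (uintptr_t)pagesize - 1;
    uintptr_t end  = (addr + len + mask) & ~mask;  /* round end up */

    *start = addr & ~mask;                         /* round start down */
    *span  = (size_t)(end - *start);
}
```

Calling it with (uintptr_t)(punt + offset) and len shows how many pages each piece really touches, and whether two neighbouring pieces bound to different nodes share a page.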
>>>>
>>>> Binding seems to work well, since the function returns 0
>>>> for every call.
>>>>
>>>> The problem is that after binding, a simple small malloc fails
>>>> without any apparent reason.
>>>>
>>>> With memory binding disabled, the allocations work well. Is there any
>>>> known problem when hwloc_set_area_membind_nodeset is used intensively?
>>>>
>>>> Is there an operating system limit on binding memory pages?
>>>>
>>>> Thanks in advance.
>>>>
>>>> --
>>>> Ing. Gabriele Fatigati
>>>>
>>>> HPC specialist
>>>>
>>>> SuperComputing Applications and Innovation Department
>>>>
>>>> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>>>
>>>> www.cineca.it Tel: +39 051 6171722
>>>>
>>>> g.fatigati [AT] cineca.it
>>>>
>>>>
>>>> _______________________________________________
>>>> hwloc-users mailing list
>>>> hwloc-users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>
>
>
>

-- 
Ing. Gabriele Fatigati
HPC specialist
SuperComputing Applications and Innovation Department
Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
www.cineca.it                    Tel:   +39 051 6171722
g.fatigati [AT] cineca.it