Oops,
On 06/09/2012 10:13, Gabriele Fatigati wrote:
> Downsizing the array, up to 4GB,
> valgrind gives many warnings reported in the attached file.
Adding hwloc_topology_destroy() at the end of the file would likely remove most of them.
But that won't fix the problem since the leaks are small.
==28082== LEAK SUMMARY:
==28082== definitely lost: 4,080 bytes in 3 blocks
==28082== indirectly lost: 51,708 bytes in 973 blocks
==28082== possibly lost: 304 bytes in 1 blocks
==28082== still reachable: 1,786 bytes in 4 blocks
==28082== suppressed: 0 bytes in 0 blocks
I don't know where to look, sorry.
Brice
2012/9/6 Gabriele Fatigati <g.fatigati@cineca.it>
Sorry,
I used the wrong hwloc installation. Using the hwloc build with the printf checks,
the mbind call in hwloc_linux_set_area_membind() fails:
Error from HWLOC mbind: Cannot allocate memory
so this is the origin of the bad allocation.
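A minimal sketch of the kind of printf check discussed in this thread, assuming a hypothetical helper named describe_failure (it is not part of hwloc). It takes the errno value saved right after the failing call, formats a message, and reports whether the failure was ENOMEM. It only uses standard C library functions; the exact text returned by strerror depends on the C library.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not an hwloc function.
 * Formats "Error from <where>: <strerror text>" into buf and
 * returns 1 if saved_errno is ENOMEM, 0 otherwise. */
int describe_failure(const char *where, int saved_errno,
                     char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "Error from %s: %s", where, strerror(saved_errno));
    return saved_errno == ENOMEM;
}
```

The errno value should be copied into a local variable immediately after the failing call and before any printf, since later library calls may overwrite errno.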
I attach the right valgrind output
valgrind --track-origins=yes --log-file=output_valgrind --leak-check=full --tool=memcheck --show-reachable=yes ./main_hybrid_bind_mem
2012/9/6 Gabriele Fatigati <g.fatigati@cineca.it>
Hi Brice, hi Jeff,
> Can you add some printf inside hwloc_linux_set_area_membind() in src/topology-linux.c to see if ENOMEM comes from the mbind syscall or not?
I added printf inside that function, but ENOMEM does not come from there.
>Have you run your application through valgrind or another memory-checking debugger?
I tried with valgrind:
valgrind --track-origins=yes --log-file=output_valgrind --leak-check=full --tool=memcheck --show-reachable=yes ./main_hybrid_bind_mem
==25687== Warning: set address range perms: large range [0x39454040, 0x2218d4040) (undefined)
==25687==
==25687== Valgrind's memory management: out of memory:
==25687== newSuperblock's request for 4194304 bytes failed.
==25687== 34253180928 bytes have already been allocated.
==25687== Valgrind cannot continue. Sorry.
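As a sanity check on the valgrind warning above, the size of the reported range can be computed from its two addresses. The helper name range_bytes is hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a half-open address range [start, end). */
uint64_t range_bytes(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return end - start;
}
```

With the addresses from the warning, range_bytes(0x39454040, 0x2218d4040) gives 8,192,000,000 bytes, which is consistent with the roughly 8 GB allocation mentioned elsewhere in the thread.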
I attach the full output.
The code also dies when using pure OpenMP code. Very mysteriously.
2012/9/5 Jeff Squyres <jsquyres@cisco.com>
On Sep 5, 2012, at 2:36 PM, Gabriele Fatigati wrote:
> I don't think is a simply out of memory since NUMA node has 48 GB, and I'm allocating just 8 GB.
Mmm. Probably right.
Have you run your application through valgrind or another memory-checking debugger?
I've seen cases of heap corruption lead to malloc incorrectly failing with ENOMEM.
--
Jeff Squyres
jsquyres@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
_______________________________________________
hwloc-users mailing list
hwloc-users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
--
Ing. Gabriele Fatigati
HPC specialist
SuperComputing Applications and Innovation Department
Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
www.cineca.it Tel: +39 051 6171722
g.fatigati [AT] cineca.it