Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Memory allocation error when linking with MPI libraries
From: Nysal Jan (jnysal_at_[hidden])
Date: 2010-08-08 12:02:56


What interconnect are you using? InfiniBand? To disable ptmalloc, build
Open MPI with the "--without-memory-manager" configure option.
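
After rebuilding, it helps to re-run the failing allocation outside Fortran. Below is a minimal C sketch of the test quoted further down. It is not the C test Nicolas ran (that one was not posted). The name try_allocate is made up for illustration, and the 4-byte element size assumes gfortran's default integer kind.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the original post): allocate n 4-byte
 * integers, write the first and last element as the Fortran test does,
 * then free the block.
 * Returns 0 on success, -1 if n is 0, the byte count would overflow
 * size_t, or malloc() fails.
 */
int try_allocate(size_t n)
{
    if (n == 0 || n > SIZE_MAX / sizeof(int32_t)) {
        return -1;
    }

    /* volatile keeps the compiler from optimizing away the malloc/free
     * pair and the two stores. */
    volatile int32_t *z = malloc(n * sizeof(int32_t));
    if (z == NULL) {
        return -1;
    }

    z[0] = 111;      /* mirrors z(1) = 111 */
    z[n - 1] = 222;  /* mirrors z(l) = 222 */

    free((void *)z);
    return 0;
}
```

Calling try_allocate(1600000000) from a small main(), built once with the mpicc wrapper and once with plain gcc, should show whether the crash follows the Open MPI libraries.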

Regards
--Nysal

On Sun, Aug 8, 2010 at 7:49 PM, Nicolas Deladerriere <
nicolas.deladerriere_at_[hidden]> wrote:

> Yes, I am using a 24 GB machine running a 64-bit Linux OS.
> If I compile without the wrapper, I do not get any problems.
>
> It seems that when I link with Open MPI, my program uses a malloc
> implemented by Open MPI. Is it possible to switch it off in order to use
> only the malloc from libc?
>
> Nicolas
>
> 2010/8/8 Terry Frankcombe <terry_at_[hidden]>
>
>> You're trying to do a 6 GB allocation. Can your underlying system handle
>> that? If you compile without the wrapper, does it work?
>>
>> I see your executable is using the OMPI memory stuff. IIRC there are
>> switches to turn that off.
>>
>>
>> On Fri, 2010-08-06 at 15:05 +0200, Nicolas Deladerriere wrote:
>> > Hello,
>> >
>> > I am getting a SIGSEGV when running a simple program compiled and
>> > linked with Open MPI.
>> > I have reproduced the problem using very simple Fortran code. It
>> > does not even use MPI; it is only linked with the MPI shared
>> > libraries. (The problem does not appear when I do not link with the
>> > MPI libraries.)
>> > % cat allocate.F90
>> > program test
>> >   implicit none
>> >   integer, dimension(:), allocatable :: z
>> >   integer(kind=8) :: l
>> >
>> >   write(*,*) "l ?"
>> >   read(*,*) l
>> >
>> >   ALLOCATE(z(l))
>> >   z(1) = 111
>> >   z(l) = 222
>> >   DEALLOCATE(z)
>> >
>> > end program test
>> >
>> > I am using Open MPI 1.4.2 and gfortran for my tests. Here is the
>> > compilation:
>> >
>> > % ./openmpi-1.4.2/build/bin/mpif90 --showme -g -o testallocate allocate.F90
>> > gfortran -g -o testallocate allocate.F90
>> > -I/s0/scr1/TOMOT_19311_HAL_/openmpi-1.4.2/build/include -pthread
>> > -I/s0/scr1/TOMOT_19311_HAL_/openmpi-1.4.2/build/lib
>> > -L/s0/scr1/TOMOT_19311_HAL_/openmpi-1.4.2/build/lib -lmpi_f90
>> > -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl
>> > -lutil -lm -ldl -pthread
>> >
>> > When I run the test with different lengths, I sometimes get a
>> > "Segmentation fault" error. Here are two examples using two specific
>> > values, but the error happens for many other lengths (I did not
>> > manage to find which lengths give the error).
>> >
>> > % ./testallocate
>> > l ?
>> > 1600000000
>> > Segmentation fault
>> > % ./testallocate
>> > l ?
>> > 2000000000
>> >
>> > I ran a debugger against a version of Open MPI recompiled with debug
>> > flags. I got the following error in function sYSMALLOc:
>> >
>> > Program received signal SIGSEGV, Segmentation fault.
>> > 0x00002aaaab70b3b3 in sYSMALLOc (nb=6400000016, av=0x2aaaab930200)
>> > at malloc.c:3239
>> > 3239 set_head(remainder, remainder_size | PREV_INUSE);
>> > Current language: auto; currently c
>> > (gdb) bt
>> > #0 0x00002aaaab70b3b3 in sYSMALLOc (nb=6400000016,
>> > av=0x2aaaab930200) at malloc.c:3239
>> > #1 0x00002aaaab70d0db in opal_memory_ptmalloc2_int_malloc
>> > (av=0x2aaaab930200, bytes=6400000000) at malloc.c:4322
>> > #2 0x00002aaaab70b773 in opal_memory_ptmalloc2_malloc
>> > (bytes=6400000000) at malloc.c:3435
>> > #3 0x00002aaaab70a665 in opal_memory_ptmalloc2_malloc_hook
>> > (sz=6400000000, caller=0x2aaaabf8534d) at hooks.c:667
>> > #4 0x00002aaaabf8534d in _gfortran_internal_free ()
>> > from /usr/lib64/libgfortran.so.1
>> > #5 0x0000000000400bcc in MAIN__ () at allocate.F90:11
>> > #6 0x0000000000400c4e in main ()
>> > (gdb) display
>> > (gdb) list
>> > 3234 if ((unsigned long)(size) >= (unsigned long)(nb + MINSIZE)) {
>> > 3235   remainder_size = size - nb;
>> > 3236   remainder = chunk_at_offset(p, nb);
>> > 3237   av->top = remainder;
>> > 3238   set_head(p, nb | PREV_INUSE | (av != &main_arena ? NON_MAIN_ARENA : 0));
>> > 3239   set_head(remainder, remainder_size | PREV_INUSE);
>> > 3240   check_malloced_chunk(av, p, nb);
>> > 3241   return chunk2mem(p);
>> > 3242 }
>> > 3243
>> >
>> >
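
The sizes in the backtrace follow directly from the input, and the arithmetic can be sketched in C as a rough check. The helper names are hypothetical, and the 4-byte element size assumes gfortran's default integer. The padding rule is the usual 64-bit request2size() rounding in glibc-style ptmalloc2 (one 8-byte header word, rounded up to a multiple of 16 bytes). That Open MPI's bundled copy uses the same rule is an assumption.

```c
#include <stdint.h>

/* Bytes requested for n default Fortran integers (assumed 4 bytes each). */
uint64_t request_bytes(uint64_t n)
{
    return n * 4u;
}

/*
 * Chunk size the allocator would actually carve out for a request,
 * assuming the 64-bit ptmalloc2 layout: SIZE_SZ = 8, alignment = 16,
 * minimum chunk size = 32.
 */
uint64_t padded_chunk_size(uint64_t bytes)
{
    const uint64_t size_sz = 8;                  /* one header word */
    const uint64_t align_mask = 2 * size_sz - 1; /* 16-byte alignment */
    const uint64_t min_size = 32;                /* smallest chunk */

    uint64_t padded = (bytes + size_sz + align_mask) & ~align_mask;
    return padded < min_size ? min_size : padded;
}
```

With l = 1600000000 this gives 6400000000 bytes requested and a 6400000016-byte chunk, matching the bytes= and nb= values in the backtrace.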
>> > I also did the same test in C and I got the same problem.
>> >
>> > Does anyone have an idea that could help me understand what's going
>> > on?
>> >
>> > Regards
>> > Nicolas
>> >
>> > _______________________________________________
>> > users mailing list
>> > users_at_[hidden]
>> > http://www.open-mpi.org/mailman/listinfo.cgi/users