Open MPI User's Mailing List Archives

Subject: [OMPI users] Memory allocation error when linking with MPI libraries
From: Nicolas Deladerriere (nicolas.deladerriere_at_[hidden])
Date: 2010-08-06 09:05:20


Hello,

I am getting a SIGSEGV when running a simple program compiled and linked
with Open MPI.
I have reproduced the problem with a very simple Fortran program. It does
not even call MPI; it is only linked against the MPI shared libraries. (The
problem does not appear when I do not link with the MPI libraries.)
   % cat allocate.F90
   program test
   implicit none
       integer, dimension(:), allocatable :: z
       integer(kind=8) :: l

       write(*,*) "l ?"
       read(*,*) l

       ALLOCATE(z(l))
       z(1) = 111
       z(l) = 222
       DEALLOCATE(z)

   end program test

I am using Open MPI 1.4.2 and gfortran for my tests. Here is the compilation
command:

   % ./openmpi-1.4.2/build/bin/mpif90 --showme -g -o testallocate allocate.F90
   gfortran -g -o testallocate allocate.F90 -I/s0/scr1/TOMOT_19311_HAL_/openmpi-1.4.2/build/include -pthread -I/s0/scr1/TOMOT_19311_HAL_/openmpi-1.4.2/build/lib -L/s0/scr1/TOMOT_19311_HAL_/openmpi-1.4.2/build/lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl -pthread

When I run the test with different lengths, I sometimes get a
"Segmentation fault" error. Here are two examples with two specific values,
but the error occurs for many other lengths (I have not managed to work out
which lengths trigger it):

   % ./testallocate
    l ?
   1600000000
   Segmentation fault
   % ./testallocate
    l ?
   2000000000

I ran the test under a debugger, using a version of Open MPI rebuilt with
debug flags. I got the following error in the function sYSMALLOc:

   Program received signal SIGSEGV, Segmentation fault.
   0x00002aaaab70b3b3 in sYSMALLOc (nb=6400000016, av=0x2aaaab930200) at malloc.c:3239
   3239 set_head(remainder, remainder_size | PREV_INUSE);
   Current language: auto; currently c
   (gdb) bt
   #0 0x00002aaaab70b3b3 in sYSMALLOc (nb=6400000016, av=0x2aaaab930200) at malloc.c:3239
   #1 0x00002aaaab70d0db in opal_memory_ptmalloc2_int_malloc (av=0x2aaaab930200, bytes=6400000000) at malloc.c:4322
   #2 0x00002aaaab70b773 in opal_memory_ptmalloc2_malloc (bytes=6400000000) at malloc.c:3435
   #3 0x00002aaaab70a665 in opal_memory_ptmalloc2_malloc_hook (sz=6400000000, caller=0x2aaaabf8534d) at hooks.c:667
   #4 0x00002aaaabf8534d in _gfortran_internal_free () from /usr/lib64/libgfortran.so.1
   #5 0x0000000000400bcc in MAIN__ () at allocate.F90:11
   #6 0x0000000000400c4e in main ()
   (gdb) display
   (gdb) list
   3234 if ((unsigned long)(size) >= (unsigned long)(nb + MINSIZE)) {
   3235 remainder_size = size - nb;
   3236 remainder = chunk_at_offset(p, nb);
   3237 av->top = remainder;
   3238 set_head(p, nb | PREV_INUSE | (av != &main_arena ? NON_MAIN_ARENA : 0));
   3239 set_head(remainder, remainder_size | PREV_INUSE);
   3240 check_malloced_chunk(av, p, nb);
   3241 return chunk2mem(p);
   3242 }
   3243

I also ran the same test in C and got the same problem.

Does anyone have an idea that could help me understand what is going on?

Regards
Nicolas