Open MPI User's Mailing List Archives

Subject: [OMPI users] Scalability issue
From: Benjamin Toueg (btoueg_at_[hidden])
Date: 2010-12-02 09:36:23


Hi,

I am using DRAGON, a neutronics simulation code written in FORTRAN 77 that has
its own data structures. I added a module that sends these data structures with
MPI_SEND / MPI_RECV, and everything worked perfectly for a while.

Then I had to increase the number of data structures being sent, up to the
point where runs on my cluster crash with the following error:
*** Process received signal ***
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0x2c2579fc0
[ 0] /lib/libpthread.so.0 [0x7f52d2930410]
[ 1] /home/toueg/openmpi/lib/openmpi/mca_pml_ob1.so [0x7f52d153fe03]
[ 2] /home/toueg/openmpi/lib/libmpi.so.0(PMPI_Recv+0x2d2) [0x7f52d3504a1e]
[ 3] /home/toueg/openmpi/lib/libmpi_f77.so.0(pmpi_recv_+0x10e) [0x7f52d36cf9c6]

*How can I make this error more explicit?*

I configured openmpi-1.4.3 as follows:
./configure --enable-debug --prefix=/home/toueg/openmpi CXX=g++ CC=gcc
F77=gfortran FC=gfortran FLAGS="-m64 -fdefault-integer-8 -fdefault-real-8
-fdefault-double-8" FCFLAGS="-m64 -fdefault-integer-8 -fdefault-real-8
-fdefault-double-8" --disable-mpi-f90

Here is the output of mpif77 -v:
mpif77 for 1.2.7 (release) of : 2005/11/04 11:54:51
Driving: f77 -L/usr/lib/mpich-mpd/lib -v -lmpich-p4mpd -lpthread -lrt
-lfrtbegin -lg2c -lm -shared-libgcc
Reading specs from
/usr/lib/gcc/x86_64-linux-gnu/3.4.6/specs
Configured with: ../src/configure -v --enable-languages=c,c++,f77,pascal
--prefix=/usr --libexecdir=/usr/lib
--with-gxx-include-dir=/usr/include/c++/3.4 --enable-shared
--with-system-zlib --enable-nls --without-included-gettext
--program-suffix=-3.4 --enable-__cxa_atexit --enable-clocale=gnu
--enable-libstdcxx-debug x86_64-linux-gnu
Thread model: posix
gcc version 3.4.6 (Debian 3.4.6-5)
 /usr/lib/gcc/x86_64-linux-gnu/3.4.6/collect2 --eh-frame-hdr -m elf_x86_64
-dynamic-linker /lib64/ld-linux-x86-64.so.2
/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/crt1.o
/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/crti.o
/usr/lib/gcc/x86_64-linux-gnu/3.4.6/crtbegin.o -L/usr/lib/mpich-mpd/lib
-L/usr/lib/gcc/x86_64-linux-gnu/3.4.6 -L/usr/lib/gcc/x86_64-linux-gnu/3.4.6
-L/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib
-L/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../.. -L/lib/../lib
-L/usr/lib/../lib -lmpich-p4mpd -lpthread -lrt -lfrtbegin -lg2c -lm -lgcc_s
-lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-linux-gnu/3.4.6/crtend.o
/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/crtn.o
/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/libfrtbegin.a(frtbegin.o):
In function `main':
(.text+0x1e): undefined reference to `MAIN__'
collect2: ld returned 1 exit status

Thanks,
Benjamin