Dear list members,
I am using Open MPI 1.3.3 with OFED on an HP cluster running Red Hat Linux.
Occasionally (not always) I get a crash with the following message:
[hydra11:09312] *** Process received signal ***
[hydra11:09312] Signal: Segmentation fault (11)
[hydra11:09312] Signal code: Address not mapped (1)
[hydra11:09312] Failing at address: 0xffffffffab5f30a8
[hydra11:09312] [ 0] /lib64/libpthread.so.0 [0x3c1400e4c0]
[hydra11:09312] [ 1] /home/ipl/openmpi-1.3.3/platforms/hp/lib/libmpi.so.0(MPI_Isend+0x93) [0x2af1be45a3e3]
[hydra11:09312] [ 2] ./flow(MP_SendReal+0x60) [0x6bc993]
[hydra11:09312] [ 3] ./flow(SendRealsAlongFaceWithOffset_3D+0x4ab) [0x68ba19]
[hydra11:09312] [ 4] ./flow(MP_SendVertexArrayBlock+0x23d) [0x6891e1]
[hydra11:09312] [ 5] ./flow(MB_CommAllVertex+0x65) [0x6848ba]
[hydra11:09312] [ 6] ./flow(MB_SetupVertexArray+0xd5) [0x68c837]
[hydra11:09312] [ 7] ./flow(MB_SetupGrid+0xa8) [0x68be51]
[hydra11:09312] [ 8] ./flow(SetGrid+0x58) [0x446224]
[hydra11:09312] [ 9] ./flow(main+0x148) [0x43b728]
[hydra11:09312] [10] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3c1341d974]
[hydra11:09312] [11] ./flow(__gxx_personality_v0+0xd9) [0x429b19]
[hydra11:09312] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 6 with PID 9312 on node hydra11 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
The crash does not always occur; sometimes the application runs fine. However, it seems to
occur mainly when I run on more than one node.
I have searched the Open MPI mailing list archive and found many error messages of the
same kind, but none from version 1.3.3 and none directly relevant.
I would really appreciate any comments on this.
Below is the information requested on the Open MPI web site.
config.log: attached (config.zip)
Open MPI was configured with a prefix and with the path to openib, and with the
following compiler flags:
setenv CC gcc
setenv CFLAGS '-O'
setenv CXX g++
setenv CXXFLAGS '-O'
setenv F77 'gfortran'
setenv FFLAGS '-O'
ompi_info --all: attached
The application (named flow) was launched on hydra11 with:
nohup mpirun -H hydra11,hydra12 -np 8 ./flow caseC.in &
PATH and LD_LIBRARY_PATH on hydra11 and hydra12:
PATH=/home/ipl/openmpi-1.3.3/platforms/hp/bin
LD_LIBRARY_PATH=/home/ipl/openmpi-1.3.3/platforms/hp/lib
OpenFabrics version: 1.4
Linux: X86_64-redhat-linux/3.4.6
ibv_devinfo, hydra11: attached
ibv_devinfo, hydra12: attached
ifconfig, hydra11: attached
ifconfig, hydra12: attached
ulimit -l (hydra11): 6000000
ulimit -l (hydra12): unlimited
Furthermore, I have not specified any MCA parameters.
The application I am running (named flow) is compiled and linked against Fortran, C
and C++ libraries as follows:
/home/ipl/openmpi-1.3.3/platforms/hp/bin/mpicc -DMP -DNS3_ARCH_LINUX -DLAPACK -I/home/ipl/ns3/engine/include_forLinux -I/home/ipl/openmpi-1.3.3/platforms/hp/include -c -o user_small_3D.o user_small_3D.c
rm -f flow
/home/ipl/openmpi-1.3.3/platforms/hp/bin/mpicxx -o flow user_small_3D.o -L/home/ipl/ns3/engine/lib_forLinux -lns3main -lns3pars -lns3util -lns3vofl -lns3turb -lns3solv -lns3mesh -lns3diff -lns3grid -lns3line -lns3data -lns3base -lfitpack -lillusolve -lfftpack_small -lfenton -lns3air -lns3dens -lns3poro -lns3sedi -llapack_small -lblas_small -lm -lgfortran /home/ipl/ns3/engine/lib_Tecplot_forLinux/tecio64.a
Please let me know if you need more info!
Thanks in advance,
Iris Lohmann
Iris Pernille Lohmann
MSc, PhD
Ports & Offshore Technology (POT)
DHI
Agern Allé 5
DK-2970 Hørsholm
Denmark
Tel: +45 4516 9200
Direct: 45169427
ipl@dhigroup.com
www.dhigroup.com
WATER • ENVIRONMENT • HEALTH