Thanks for your reply.
There are no core files associated with the crash. Based on your answer, and the fact that the crash only appears occasionally, I think I need to debug more carefully as you suggest - it may well be that something in the application is not working quite right.
Thanks again, and thanks for all the help which is passed on through this list - it is very helpful and a lot of work.
From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On Behalf Of Jeff Squyres
Sent: 03 November 2009 03:19
To: Open MPI Users
Subject: Re: [OMPI users] segmentation fault: Address not mapped
Many thanks for all this information. Unfortunately, it's not enough
to tell what's going on. :-(
Do you know for sure that the application is correct? E.g., is it
possible that a bad buffer is being passed to MPI_Isend? I note that
it is fairly odd to fail in MPI_Isend itself because that function is
actually pretty short -- it mainly checks parameters and then calls a
back-end Open MPI function to actually do the send.
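If it helps, here is a rough sketch of the kind of sanity check I mean (untested, and the names checked_isend / buf / count / nbr are just placeholders, not from your code): touch the buffer in application code before handing it to MPI, so that an unmapped page faults in your code rather than inside libmpi.

    /* Sketch: validate the arguments before handing them to MPI_Isend. */
    #include <mpi.h>
    #include <stdio.h>

    static void checked_isend(double *buf, int count, int nbr, int tag,
                              MPI_Comm comm, MPI_Request *req)
    {
        volatile double probe;

        if (buf == NULL || count < 0) {
            fprintf(stderr, "bad send buffer %p / count %d\n",
                    (void *) buf, count);
            MPI_Abort(comm, 1);
        }
        /* Touch the first and last elements so an unmapped page faults
           here, in application code, rather than inside the MPI library. */
        if (count > 0) {
            probe = buf[0];
            probe = buf[count - 1];
            (void) probe;
        }
        MPI_Isend(buf, count, MPI_DOUBLE, nbr, tag, comm, req);
    }

    int main(int argc, char **argv)
    {
        double data[4] = { 1.0, 2.0, 3.0, 4.0 };
        double recvbuf[4];
        int rank;
        MPI_Request sreq, rreq;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Send-to-self, just to exercise the wrapper. */
        MPI_Irecv(recvbuf, 4, MPI_DOUBLE, rank, 7, MPI_COMM_WORLD, &rreq);
        checked_isend(data, 4, rank, 7, MPI_COMM_WORLD, &sreq);
        MPI_Wait(&sreq, MPI_STATUS_IGNORE);
        MPI_Wait(&rreq, MPI_STATUS_IGNORE);

        MPI_Finalize();
        return 0;
    }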
Do you get corefiles with the killed processes, and can you analyze
where the application failed? If so, can you verify that all state in
the application appears to be correct? It might be helpful to analyze
exactly where the application failed (e.g., compile at least
ompi/mpi/c/isend.c with the -g flag so that you can get some debugging
information about exactly where in MPI_Isend it failed -- like I said,
it's a short function that mainly checks parameters). You might want
to have your application double check all the parameters that are
passed to MPI_Isend, too.
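One other thing that sometimes helps (a sketch only, assuming your application does not already install its own error handler): switch MPI_COMM_WORLD to MPI_ERRORS_RETURN and check the return code of every MPI_Isend. This will not catch a segfault from an unmapped buffer, but it does turn argument errors that the library itself detects into error codes you can print.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, rc, len;
        double out = 42.0, in = 0.0;
        char msg[MPI_MAX_ERROR_STRING];
        MPI_Request reqs[2];

        MPI_Init(&argc, &argv);

        /* Report MPI errors as return codes instead of aborting the job. */
        MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Each rank sends one value to itself, just to exercise the calls. */
        MPI_Irecv(&in, 1, MPI_DOUBLE, rank, 0, MPI_COMM_WORLD, &reqs[0]);
        rc = MPI_Isend(&out, 1, MPI_DOUBLE, rank, 0, MPI_COMM_WORLD, &reqs[1]);
        if (rc != MPI_SUCCESS) {
            MPI_Error_string(rc, msg, &len);
            fprintf(stderr, "rank %d: MPI_Isend failed: %s\n", rank, msg);
            MPI_Abort(MPI_COMM_WORLD, rc);
        }
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

        MPI_Finalize();
        return 0;
    }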
On Oct 26, 2009, at 9:43 AM, Iris Pernille Lohmann wrote:
> Dear list members
> I am using Open MPI 1.3.3 with OFED on an HP cluster with Red Hat Linux.
> Occasionally (not always) I get a crash with the following message:
> [hydra11:09312] *** Process received signal ***
> [hydra11:09312] Signal: Segmentation fault (11)
> [hydra11:09312] Signal code: Address not mapped (1)
> [hydra11:09312] Failing at address: 0xffffffffab5f30a8
> [hydra11:09312] [ 0] /lib64/libpthread.so.0 [0x3c1400e4c0]
> [hydra11:09312] [ 1] /home/ipl/openmpi-1.3.3/platforms/hp/lib/libmpi.so.0(MPI_Isend+0x93) [0x2af1be45a3e3]
> [hydra11:09312] [ 2] ./flow(MP_SendReal+0x60) [0x6bc993]
> [hydra11:09312] [ 3] ./flow(SendRealsAlongFaceWithOffset_3D+0x4ab)
> [hydra11:09312] [ 4] ./flow(MP_SendVertexArrayBlock+0x23d) [0x6891e1]
> [hydra11:09312] [ 5] ./flow(MB_CommAllVertex+0x65) [0x6848ba]
> [hydra11:09312] [ 6] ./flow(MB_SetupVertexArray+0xd5) [0x68c837]
> [hydra11:09312] [ 7] ./flow(MB_SetupGrid+0xa8) [0x68be51]
> [hydra11:09312] [ 8] ./flow(SetGrid+0x58) [0x446224]
> [hydra11:09312] [ 9] ./flow(main+0x148) [0x43b728]
> [hydra11:09312]  /lib64/libc.so.6(__libc_start_main+0xf4)
> [hydra11:09312]  ./flow(__gxx_personality_v0+0xd9) [0x429b19]
> [hydra11:09312] *** End of error message ***
> mpirun noticed that process rank 6 with PID 9312 on node hydra11
> exited on signal 11 (Segmentation fault).
> The crash does not always appear - sometimes the application runs
> fine. However, the crash seems to occur especially when I run on
> more than one node.
> I have consulted the Open MPI mailing list archive and have found
> many error messages of the same kind, but none from the 1.3.3 version
> and none of direct relevance.
> I would really appreciate comments on this. Below is the information
> requested on the Open MPI web site:
> Config.log: attached (config.zip)
> Open MPI was configured with a prefix and with the path to OpenIB,
> and with the following compiler flags:
> setenv CC gcc
> setenv CFLAGS '-O'
> setenv CXX g++
> setenv CXXFLAGS '-O'
> setenv F77 'gfortran'
> setenv FFLAGS '-O'
> ompi_info -all:
> The application (named flow) was launched on hydra11 by
> nohup mpirun -H hydra11,hydra12 -np 8 ./flow caseC.in &
> The PATH and LD_LIBRARY_PATH on hydra11 and hydra12:
> LD_LIBRARY_PATH= /home/ipl/openmpi-1.3.3/platforms/hp/lib
> OpenFabrics version: 1.4
> ibv_devinfo, hydra11: attached
> ibv_devinfo, hydra12: attached
> ifconfig, hydra11: attached
> ifconfig, hydra12: attached
> ulimit -l (hydra11): 6000000
> ulimit -l (hydra12): unlimited
> Furthermore, I can say that I have not specified any MCA parameters.
> The application I am running (named flow) is linked from Fortran, C
> and C++ libraries with the following:
> /home/ipl/openmpi-1.3.3/platforms/hp/bin/mpicc -DMP -DNS3_ARCH_LINUX -DLAPACK -I/home/ipl/ns3/engine/include_forLinux -I/home/ipl/openmpi-1.3.3/platforms/hp/include -c -o user_small_3D.o
> rm -f flow
> /home/ipl/openmpi-1.3.3/platforms/hp/bin/mpicxx -o flow user_small_3D.o -L/home/ipl/ns3/engine/lib_forLinux -lns3main -lns3pars -lns3util -lns3vofl -lns3turb -lns3solv -lns3mesh -lns3diff -lns3grid -lns3line -lns3data -lns3base -lfitpack -lillusolve -lfftpack_small -lfenton -lns3air -lns3dens -lns3poro -lns3sedi -llapack_small -lblas_small -lm -lgfortran /home/ipl/ns3/engine/
> Please let me know if you need more info!
> Thanks in advance,
> Iris Lohmann
> Iris Pernille Lohmann
> MSc, PhD
> Ports & Offshore Technology (POT)
> Agern Allé 5
> DK-2970 Hørsholm
> +45 4516 9200
> WATER . ENVIRONMENT . HEALTH