Open MPI User's Mailing List Archives

From: shen T.T. (life_floating_at_[hidden])
Date: 2006-07-20 01:42:41


I get the same error message: "forrtl: severe (174): SIGSEGV, segmentation fault occurred". The error occurs whether I run my parallel code on a single node or on multiple nodes. I tried three Intel compiler versions (8.1.037, 9.0.032 and 9.1.033), and the error persists with all of them. However, my code works correctly on Windows XP with Visual Fortran 6.6. I suspect the Intel compiler may have a bug, but I am still investigating, since the bug may be in my own code.
   
  Do you have another compiler available? Could you check whether the error also occurs with it and report back?
   
  T.T. Shen
  

Frank Gruellich <frank.gruellich_at_[hidden]> wrote:
  Hi,

I'm running OFED 1.0 with OpenMPI 1.1b1-1 compiled for Intel Compiler
9.1. I get this error message during an MPI_Alltoall call:

Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
Failing at addr:0x1cd04fe0
[0] func:/usr/ofed/mpi/intel/openmpi-1.1b1-1/lib64/libopal.so.0 [0x2b56964acc75]
[1] func:/lib64/libpthread.so.0 [0x2b569739b140]
[2] func:/software/intel/fce/9.1.032/lib/libirc.so(__intel_new_memcpy+0x1540) [0x2b5697278cf0]
*** End of error message ***

and have no idea what the problem is. It arises when I exceed a specific
number (10) of MPI nodes. The error occurs in this code:

do i = 1,npuntos
  print *,'puntos',i
  tam = 2**(i-1)
  tmin = 1e5
  tavg = 0.0d0
  do j = 1,rep
    envio = 8.0d0*j
    call mpi_barrier(mpi_comm_world,ierr)
    time1 = mpi_wtime()
    do k = 1,rep2
      call mpi_alltoall(envio,tam,mpi_byte,recibe,tam,mpi_byte,mpi_comm_world,ierr)
    end do
    call mpi_barrier(mpi_comm_world,ierr)
    time2 = mpi_wtime()
    time = (time2 - time1)/(rep2)
    if (time < tmin) tmin = time
    tavg = tavg + time
  end do
  m_tmin(i) = tmin
  m_tavg(i) = tavg/rep
end do
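
One thing worth checking in a loop like the one above: MPI_Alltoall sends `tam` elements to each rank and receives `tam` elements from each rank, so both buffers must hold at least `tam` times the communicator size. The declarations of `envio` and `recibe` are not shown in this post, so whether they are large enough is unknown; undersized buffers would be consistent with a failure that depends on the number of processes, but that is only a possibility. The following is a minimal sketch, not the original benchmark: the program name, the value of `npuntos`, the byte-array buffer type, and the use of the `mpi` module are all assumptions made for illustration.

```fortran
program alltoall_sizing
  use mpi                      ! assumption: the MPI Fortran 90 module is available
  implicit none
  integer, parameter :: npuntos = 10   ! hypothetical number of message sizes
  integer :: ierr, nprocs, rank, i, tam
  ! Byte buffers, allocated per message size (assumed type; the original
  ! declarations of envio and recibe are not shown in the post).
  character(len=1), allocatable :: envio(:), recibe(:)

  call mpi_init(ierr)
  call mpi_comm_size(mpi_comm_world, nprocs, ierr)
  call mpi_comm_rank(mpi_comm_world, rank, ierr)

  do i = 1, npuntos
    tam = 2**(i-1)             ! bytes exchanged with EACH rank
    ! Each buffer holds one block of tam bytes per rank in the communicator.
    allocate(envio(tam*nprocs), recibe(tam*nprocs))
    envio = achar(mod(rank, 128))
    call mpi_alltoall(envio, tam, mpi_byte, recibe, tam, mpi_byte, mpi_comm_world, ierr)
    deallocate(envio, recibe)
  end do

  if (rank == 0) print *, 'alltoall completed for', npuntos, 'message sizes'
  call mpi_finalize(ierr)
end program alltoall_sizing
```

If a version like this runs cleanly at the process counts where the benchmark fails, the buffer declarations in the benchmark are the first place to look; if it also fails, the problem more likely lies in the MPI installation or the interconnect setup.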

This code reportedly runs on another system (running IBGD 1.8.x).
I already tested mpich_mlx_intel-0.9.7_mlx2.1.0-1, but get a similar
error message when using 13 nodes:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
libpthread.so.0 00002B65DA39B140 Unknown Unknown Unknown
main.out 0000000000448BDB Unknown Unknown Unknown
[9] Registration failed, file : intra_rdma_alltoall.c, line : 163
[6] Registration failed, file : intra_rdma_alltoall.c, line : 163
9 - MPI_ALLTOALL : Unknown error
[9] [] Aborting Program!
6 - MPI_ALLTOALL : Unknown error
[6] [] Aborting Program!
[2] Registration failed, file : intra_rdma_alltoall.c, line : 163
[11] Registration failed, file : intra_rdma_alltoall.c, line : 163
11 - MPI_ALLTOALL : Unknown error
[11] [] Aborting Program!
2 - MPI_ALLTOALL : Unknown error
[2] [] Aborting Program!
[10] Registration failed, file : intra_rdma_alltoall.c, line : 163
10 - MPI_ALLTOALL : Unknown error
[10] [] Aborting Program!
[5] Registration failed, file : intra_rdma_alltoall.c, line : 163
5 - MPI_ALLTOALL : Unknown error
[5] [] Aborting Program!
[3] Registration failed, file : intra_rdma_alltoall.c, line : 163
[8] Registration failed, file : intra_rdma_alltoall.c, line : 163
3 - MPI_ALLTOALL : Unknown error
[3] [] Aborting Program!
8 - MPI_ALLTOALL : Unknown error
[8] [] Aborting Program!
[4] Registration failed, file : intra_rdma_alltoall.c, line : 163
4 - MPI_ALLTOALL : Unknown error
[4] [] Aborting Program!
[7] Registration failed, file : intra_rdma_alltoall.c, line : 163
7 - MPI_ALLTOALL : Unknown error
[7] [] Aborting Program!
[0] Registration failed, file : intra_rdma_alltoall.c, line : 163
0 - MPI_ALLTOALL : Unknown error
[0] [] Aborting Program!
[1] Registration failed, file : intra_rdma_alltoall.c, line : 163
1 - MPI_ALLTOALL : Unknown error
[1] [] Aborting Program!

I don't know whether this is a problem with MPI or with the Intel Compiler.
Can anybody please point me in the right direction as to what I might have
done wrong? This is my first post (so be gentle), and I'm not yet used to
the level of detail expected on this list, so if you need any further
information, do not hesitate to request it.

Thanks in advance and kind regards,

-- 
Frank Gruellich
HPC-Techniker
Tel.: +49 3722 528 42
Fax: +49 3722 528 15
E-Mail: frank.gruellich_at_[hidden]
MEGWARE Computer GmbH
Vertrieb und Service
Nordstrasse 19
09247 Chemnitz/Roehrsdorf
Germany
http://www.megware.com/
_______________________________________________
users mailing list
users_at_[hidden]
http://www.open-mpi.org/mailman/listinfo.cgi/users