Thanks all for your answers. Yes, I understand that this is a
non-contiguous memory access problem, since MPI_BCAST expects a
pointer to a valid memory region. But I'm surprised that, when the MPI
module is used, Fortran does not hide this discontinuity behind a
contiguous temporary copy of the array. I've spent some time
building OpenMPI with g++/gcc/ifort (to create the right mpi module)
and ran some additional tests:
Default OpenMPI is openmpi-1.2.8-17.4.x86_64
# module load openmpi
# mpif90 ess.F90 && mpirun -np 4 ./a.out
0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3
# module unload openmpi
The result is OK, but it sometimes hangs (when I request a lot of processes).
With OpenMPI 1.4.4 and gfortran from gcc-fortran-4.5-19.1.x86_64
# module load openmpi-1.4.4-gcc-gfortran
# mpif90 ess.F90 && mpirun -np 4 ./a.out
0 -1 -1 -1 0 -1 -1 -1 0 -1 -1 -1 0 -1 -1 -1
# module unload openmpi-1.4.4-gcc-gfortran
Node 0 only updates the global array with its own subarray (I only print node 0's result).
With OpenMPI 1.4.4 and ifort 10.1.018 (yes, it's quite old, I have the latest one but it isn't installed!)
# module load openmpi-1.4.4-gcc-intel
# mpif90 ess.F90 && mpirun -np 4 ./a.out
ess.F90(15): (col. 5) remark: LOOP WAS VECTORIZED.
0 -1 -1 -1 0 -1
-1 -1 0 -1 -1 -1
0 -1 -1 -1
# mpif90 -check arg_temp_created ess.F90 && mpirun -np 4 ./a.out
gives a lot of messages like:
forrtl: warning (402): fort: (1): In call to MPI_BCAST1DI4, an array temporary was created for argument #1
So a temporary array is created for each call. So where is the problem?
As for the Fortran compiler, I rely on the same behavior (non-contiguous subarrays) in MPI_sendrecv calls and everything works fine: I ran some intensive tests with 1 to 128 processes on my quad-core workstation. This Fortran solution was easier than creating user-defined data types.
Can you reproduce this behavior with the test case? What are your OpenMPI and gfortran/ifort versions?
Thanks again
Patrick
The test code:
PROGRAM bide
USE mpi
IMPLICIT NONE
INTEGER :: nbcpus
INTEGER :: my_rank
INTEGER :: ierr,i,buf
INTEGER, ALLOCATABLE:: tab(:,:)
CALL MPI_INIT(ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD,my_rank,ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD,nbcpus,ierr)
ALLOCATE (tab(0:nbcpus-1,4))
tab(:,:)=-1
tab(my_rank,:)=my_rank
DO i=0,nbcpus-1
CALL MPI_BCAST(tab(i,:),4,MPI_INTEGER,i,MPI_COMM_WORLD,ierr)
ENDDO
IF (my_rank .EQ. 0) print*,tab
CALL MPI_FINALIZE(ierr)
END PROGRAM bide
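For comparison, here is a minimal sketch of a variant that does not rely on the compiler-generated array temporary. It assumes the problem lies in how the temporary for tab(i,:) is copied back after the call, which I have not confirmed. Each row is copied by hand into a contiguous buffer, broadcast, and copied back. The program name bide_buf and the 4-element array buf are only illustrative.

```fortran
PROGRAM bide_buf
  USE mpi
  IMPLICIT NONE
  INTEGER :: nbcpus
  INTEGER :: my_rank
  INTEGER :: ierr, i
  INTEGER :: buf(4)                 ! contiguous staging buffer (illustrative)
  INTEGER, ALLOCATABLE :: tab(:,:)

  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nbcpus, ierr)

  ALLOCATE (tab(0:nbcpus-1,4))
  tab(:,:) = -1
  tab(my_rank,:) = my_rank

  DO i = 0, nbcpus-1
    ! Explicit copy-in: row i is strided in memory, buf is not.
    buf(:) = tab(i,:)
    CALL MPI_BCAST(buf, 4, MPI_INTEGER, i, MPI_COMM_WORLD, ierr)
    ! Explicit copy-out, done on every rank (a no-op in effect on the root).
    tab(i,:) = buf(:)
  ENDDO

  IF (my_rank .EQ. 0) PRINT *, tab

  DEALLOCATE (tab)
  CALL MPI_FINALIZE(ierr)
END PROGRAM bide_buf
```

It can be built and run the same way as the test case (mpif90, then mpirun -np 4 ./a.out). If this version fills every row on node 0 while the original does not, that would point at the copy-out of the temporary rather than at the broadcast itself.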
-- ===============================================================
| Equipe M.O.S.T. | http://most.hmg.inpg.fr |
| Patrick BEGOU | ------------ |
| LEGI | mailto:Patrick.Begou@hmg.inpg.fr |
| BP 53 X | Tel 04 76 82 51 35 |
| 38041 GRENOBLE CEDEX | Fax 04 76 82 52 71 |
===============================================================