Open MPI logo

Open MPI User's Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Open MPI User's mailing list

Subject: Re: [OMPI users] MPI_BCAST and fortran subarrays
From: Gustavo Correa (gus_at_[hidden])
Date: 2011-12-12 12:38:18


Hi Patrick

I think tab(i,:) is not contiguous in memory, but has a stride of nbcpus.
Since the MPI type you are passing is just the barebones MPI_INTEGER,
MPI_BCAST expects the four integers to be contiguous in memory, I guess.
The MPI calls don't have any idea of the Fortran90 memory layout,
and the tab(i,:) that you pass to MPI_BCAST means only the address for the *first*
MPI_INTEGER to be broadcast (sent and received).

My impression is that you could either:
1) Declare your table transposed, i.e, tab(4,nbcpus-1),
and make a few adjustments in the code
to adapt to this change, which would make tab(:,i) contiguous in memory.
or
2) Keep your current declaration of 'tab', but declare an MPI_TYPE_VECTOR with
the appropriate stride (nbcpus) and use it in your MPI_BCAST call.

For MPI user defined types see Ch. 3 of "MPI, The Complete Reference, Vol.1, The MPI Core, 2nd Ed." by M. Snir et. al.

I hope this helps,
Gus Correa

On Dec 12, 2011, at 10:35 AM, Patrick Begou wrote:

> I've got a strange problem with Fortran90 and MPI_BCAST call on a large application. I've isolated the problem in this short program samples.
> With fortran we can use subarrays in functions calls. Example, with passing a subarray to the "change" procedure:
>
> MODULE mymod
> IMPLICIT NONE
> CONTAINS
> SUBROUTINE change(tab,i)
> IMPLICIT NONE
> INTEGER, INTENT(INOUT),DIMENSION(:)::tab
> INTEGER, INTENT(IN) :: i
> tab(:)=i
> END SUBROUTINE change
> END MODULE mymod
>
> PROGRAM toto
> USE mymod
> IMPLICIT NONE
> INTEGER, PARAMETER::nx=6, ny=4
> INTEGER, DIMENSION(nx,ny):: tab
> INTEGER::i
>
> tab=-1
> DO i=1,nx
> CALL change(tab(i,:),i)
> ENDDO
> PRINT*,tab
> END PROGRAM toto
>
> But If I use subarrays with MPI_BCAST() like in this example:
>
> PROGRAM bide
> USE mpi
> IMPLICIT NONE
> INTEGER :: nbcpus
> INTEGER :: my_rank
> INTEGER :: ierr,i,buf
> INTEGER, ALLOCATABLE:: tab(:,:)
>
> CALL MPI_INIT(ierr)
> CALL MPI_COMM_RANK(MPI_COMM_WORLD,my_rank,ierr)
> CALL MPI_COMM_SIZE(MPI_COMM_WORLD,nbcpus,ierr)
>
> ALLOCATE (tab(0:nbcpus-1,4))
>
> tab(:,:)=-1
> tab(my_rank,:)=my_rank
> DO i=0,nbcpus-1
> CALL MPI_BCAST(tab(i,:),4,MPI_INTEGER,i,MPI_COMM_WORLD,ierr)
> ENDDO
> IF (my_rank .EQ. 0) print*,tab
> CALL MPI_FINALIZE(ierr)
>
> END PROGRAM bide
>
> It doesn't work! With openMPI 1.2.8 (OpenSuse 11.4 X86_64) I have random segfault: it works sometime, with few cpus (2, 4, 8...) and does'nt work sometime with a larger number of cpus (32, 48, 64...). With OpenMPI 1.4.4 (build from sources) it hangs (most of the array tab remains at the -1 initialization value).
> Such procedure calls are allowed with fortran90 so I do not understand why they fail here. I have to use a buffer array (called tabl in the following program) to solve the problem.
>
> PROGRAM bide
> USE mpi
> IMPLICIT NONE
> INTEGER :: nbcpus
> INTEGER :: my_rank
> INTEGER :: ierr,i,buf
> INTEGER, ALLOCATABLE:: tab(:,:)
> INTEGER::tab1(4)
>
> CALL MPI_INIT(ierr)
> CALL MPI_COMM_RANK(MPI_COMM_WORLD,my_rank,ierr)
> CALL MPI_COMM_SIZE(MPI_COMM_WORLD,nbcpus,ierr)
>
> ALLOCATE (tab(0:nbcpus-1,4))
>
> tab=-1
> tab1=-1
> DO i=0,nbcpus-1
> IF(my_rank.EQ.i) tab1=my_rank
> CALL MPI_BCAST(tab1,4,MPI_INTEGER,i,MPI_COMM_WORLD,ierr)
> tab(i,:)=tab1
> ENDDO
> IF (my_rank .EQ. 0) print*,tab
> CALL MPI_FINALIZE(ierr)
>
> END PROGRAM bide
>
> Any idea about this behavior ?
>
> Patrick
> --
> ===============================================================
> | Equipe M.O.S.T. |
> http://most.hmg.inpg.fr
> |
> | Patrick BEGOU | ------------ |
> | LEGI |
> mailto:Patrick.Begou_at_[hidden]
> |
> | BP 53 X | Tel 04 76 82 51 35 |
> | 38041 GRENOBLE CEDEX | Fax 04 76 82 52 71 |
> ===============================================================
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users