Attached is some error output from my tests of one-sided message passing, plus my info file.  Below are two copies of a simple Fortran subroutine that mimics mpi_allgatherv using mpi_get calls.  The top version fails; the bottom one runs OK.  The only difference is that the bottom version skips the mpi_get when the target is the calling process and copies the data directly instead.  From these examples, plus the 'self_send' phrases in the error output, it seems clear that there is an internal problem with a processor sending data to itself.  I know that your 'mpi_get' implementation is simply a wrapper around 'send/recv' calls, so clearly this shouldn't happen.  However, the problem does not happen in all cases; I tried to duplicate it in a simple stand-alone program with mpi_get calls and was unable to make it fail.  Go figure.

Tom


***************  This version fails, producing the error output attached ***********

        subroutine allgatherv_get(xrma,ijsiz,ijstrt,xx,iwinget)

      use mpinog
      implicit none
!
      real xrma(*) , xx(*)
      integer ijsiz(nproc) , ijstrt(nproc)
!
      integer iwinget , ierr , msgsize , j , nn,i
!
      itarget_disp = 0
!
! iwinget is 'handle' to array 'xrma'
!
      call mpi_win_fence(0,iwinget,ierr)
!
      do 200 j = 1,nproc
        nn = ijstrt(j) + 1
        msgsize = ijsiz(j)
!
! 'get' data from RMA window
!
          call mpi_get(xx(nn),msgsize,mpireal,j-1,itarget_disp,msgsize,
     &                 mpireal,iwinget,ierr)
!
  200 continue
!
      call mpi_win_fence(0,iwinget,ierr)
!
      return
      end
  

****************  This version runs   *****************************

        subroutine allgatherv_get(xrma,ijsiz,ijstrt,xx,iwinget)
!
      use mpinog
      implicit none
!
      real xrma(*) , xx(*)
      integer ijsiz(nproc) , ijstrt(nproc)
!
      integer iwinget , ierr , msgsize , j , nn,i
!
      itarget_disp = 0
!
! 'iwinget' is 'handle' to array 'xrma'
! 'ir' is rank+1
!
      call mpi_win_fence(0,iwinget,ierr)
!
      do 200 j = 1,nproc
        nn = ijstrt(j) + 1
        msgsize = ijsiz(j)
!
        if(ir.ne.j)then
!
! if 'off-processor' then 'get' data from RMA window
!
          call mpi_get(xx(nn),msgsize,mpireal,j-1,itarget_disp,msgsize,
     &                 mpireal,iwinget,ierr)
        else
!
! for 'on-processor' case, bypass MPI and do direct memory-to-memory copies
!
          do 120 i = 1,msgsize
            xx(i+ijstrt(j)) = xrma(i)
  120     continue
!
        endif
!
  200 continue
!
      call mpi_win_fence(0,iwinget,ierr)
!
      return
      end
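
For comparison, a stand-alone test of the same pattern could look something like the sketch below. It is not the test program mentioned above, only an illustration of the idea: every rank does an mpi_get from every rank, including itself, and the result is checked against mpi_allgatherv. The program name 'selfget_test', the block size 'nloc', and the arrays 'xref' and 'iwin' are made up for this example, and it uses the standard 'mpi' module and MPI_REAL rather than the 'mpinog' module and 'mpireal' from the subroutines above. It is written as free-form source (e.g. a .f90 file).

```fortran
program selfget_test
  use mpi
  implicit none
  ! nloc is a hypothetical per-rank block size chosen for this sketch
  integer, parameter :: nloc = 4
  real, allocatable :: xrma(:), xx(:), xref(:)
  integer, allocatable :: ijsiz(:), ijstrt(:)
  integer :: ierr, rank, nproc, iwin, j, nn, realsize
  ! window size and target displacement must be of kind MPI_ADDRESS_KIND
  integer(kind=MPI_ADDRESS_KIND) :: winsize, itarget_disp

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr)
  call mpi_comm_size(MPI_COMM_WORLD, nproc, ierr)

  allocate(xrma(nloc), xx(nloc*nproc), xref(nloc*nproc))
  allocate(ijsiz(nproc), ijstrt(nproc))

  ! every rank contributes nloc values; ijstrt holds 0-based offsets
  do j = 1, nproc
    ijsiz(j)  = nloc
    ijstrt(j) = (j-1)*nloc
  end do
  do j = 1, nloc
    xrma(j) = real(rank*nloc + j)
  end do
  xx = -1.0

  ! expose xrma as the RMA window, with displacements counted in reals
  call mpi_type_size(MPI_REAL, realsize, ierr)
  winsize = int(nloc, MPI_ADDRESS_KIND) * realsize
  call mpi_win_create(xrma, winsize, realsize, MPI_INFO_NULL, &
                      MPI_COMM_WORLD, iwin, ierr)

  ! 'get' from every rank, including this one (the self-get case)
  itarget_disp = 0
  call mpi_win_fence(0, iwin, ierr)
  do j = 1, nproc
    nn = ijstrt(j) + 1
    call mpi_get(xx(nn), ijsiz(j), MPI_REAL, j-1, itarget_disp, &
                 ijsiz(j), MPI_REAL, iwin, ierr)
  end do
  call mpi_win_fence(0, iwin, ierr)

  ! reference result from the collective that the gets are mimicking
  call mpi_allgatherv(xrma, nloc, MPI_REAL, xref, ijsiz, ijstrt, &
                      MPI_REAL, MPI_COMM_WORLD, ierr)

  if (all(xx == xref)) then
    print *, 'rank', rank, ': get-based gather matches mpi_allgatherv'
  else
    print *, 'rank', rank, ': MISMATCH between get and allgatherv'
  end if

  call mpi_win_free(iwin, ierr)
  call mpi_finalize(ierr)
end program selfget_test
```

One detail the sketch makes explicit is that the displacement passed to mpi_get is declared with kind MPI_ADDRESS_KIND, as the MPI standard requires. In the subroutines above 'itarget_disp' presumably comes from the 'mpinog' module, so its declaration is not visible here.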