Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] Looking for a replacement call for repeated call to MPI_IPROBE
From: Jeremy McCaslin (jomccaslin_at_[hidden])
Date: 2013-01-28 12:22:30


Thank you for the feedback. I actually just changed the repeated probing
for a message to a blocking MPI_RECV, as the processor waiting to receive
does nothing but repeatedly probe until the message is there anyway. This
also works, and it makes more sense to do it this way. However, this did
not fix my hanging issue. I am wondering if it has something to do with
the size of my buffer used in MPI_BUFFER_ATTACH. I believe I am following
the proper MPI_BSEND_OVERHEAD protocol. I am waiting on the admins to
install openmpi-1.6.3, and hoping that maybe this will fix my issue.

On Sat, Jan 26, 2013 at 7:32 AM, Jeff Squyres (jsquyres) <jsquyres_at_[hidden]
> wrote:

> First off, 1.4.4 is fairly ancient. You might want to try upgrading to
> 1.6.3.
>
> Second, you might want to use non-blocking receives for B such that you
> can MPI_WAITALL, or perhaps MPI_WAITSOME or MPI_WAITANY to wait for
> some/all of the values to arrive in B. This keeps any looping down in MPI
> (i.e., as close to the hardware as possible).
>
>
> On Jan 25, 2013, at 3:21 PM, Jeremy McCaslin <jomccaslin_at_[hidden]> wrote:
>
> > Hello,
> >
> > I am trying to figure out the most appropriate MPI calls for a certain
> portion of my code. I will describe the situation here:
> >
> > Each cell (i,j) of my array A is being updated by a calculation that
> depends on the values of 1 or 2 of the 4 possible neighbors A(i+1,j),
> A(i-1,j), A(i,j+1), and A(i,j-1). Say, for example,
> A(i,j)=A(i-1,j)*A(i,j-1). The thing is, the values of the neighbors
> A(i-1,j) and A(i,j-1) cannot be used until an auxiliary array B has been
> updated from 0 to 1. The values B(i-1,j) and B(i,j-1) are changed from 0
> -> 1 after the values A(i-1,j) and A(i,j-1) have been communicated to the
> proc that contains cell (i,j), as cells (i-1,j) and (i,j-1) belong to
> different procs. Here is pseudocode for how I have the algorithm
> implemented (in fortran):
> >
> > do while (B(ii,jj,kk).eq.0)
> > if (probe_for_message(i0,j0,k0,this_sc)) then
> > my_ibuf(1)=my_ibuf(1)+1
> > A(i0,j0,k0)=this_sc
> > B(i0,j0,k0)=1
> > end if
> > end do
> >
> > The function 'probe_for_message' uses an 'MPI_IPROBE' to see if
> 'MPI_ANY_SOURCE' has a message for my current proc. If there is a message,
> the function returns a true logical and calls 'MPI_RECV', receiving
> (i0,j0,k0,this_sc) from the proc that has the message. This works! My
> concern is that I am probing repeatedly inside the while loop until I
> receive a message from a proc such that ii=i0, jj=j0, kk=k0. I could
> potentially call MPI_IPROBE many many times before this happens... and I'm
> worried that this is a messy way of doing this. Could I "break" the mpi
> probe call? Are there MPI routines that would allow me to accomplish the
> same thing in a more formal or safer way? Maybe a persistent communication
> or something? For very large computations with many procs, I am observing
> a hanging situation which I suspect may be due to this. I observe it when
> using openmpi-1.4.4, and the hanging seems to disappear if I use mvapich.
> Any suggestions/comments would be greatly ap!
> preciated. Thanks so much!
> >
> > --
> > JM _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>

-- 
JM