Thank you for the feedback. I actually just changed the repeated probing for a message to a blocking MPI_RECV, as the processor waiting to receive does nothing but repeatedly probe until the message is there anyway. This also works, and it makes more sense to do it this way. However, this did not fix my hanging issue. I am wondering if it has something to do with the size of my buffer used in MPI_BUFFER_ATTACH. I believe I am following the proper MPI_BSEND_OVERHEAD protocol. I am waiting on the admins to install openmpi-1.6.3, and hoping that maybe this will fix my issue.
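For reference, here is a minimal sketch of how I understand the buffer sizing for MPI_BUFFER_ATTACH is supposed to work. It is not my actual code. The names nmsg and nvals, and the payload of four integers per message, are assumptions made up for this illustration. The idea is that the attached buffer needs room for every buffered send that can be outstanding at once, each counted as its packed size plus MPI_BSEND_OVERHEAD.

```fortran
program bsend_buffer_sketch
  use mpi
  implicit none
  ! Hypothetical sizes, chosen only for illustration:
  integer, parameter :: nmsg  = 8   ! max buffered sends outstanding at once
  integer, parameter :: nvals = 4   ! integers per message
  integer :: ierr, rank, pack_size, buf_size
  integer :: msg(nvals), got(nvals)
  character(len=1), allocatable :: buf(:)

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

  ! Upper bound on the space one packed message can occupy
  call MPI_PACK_SIZE(nvals, MPI_INTEGER, MPI_COMM_WORLD, pack_size, ierr)

  ! Each outstanding MPI_BSEND needs its packed size plus MPI_BSEND_OVERHEAD
  buf_size = nmsg * (pack_size + MPI_BSEND_OVERHEAD)
  allocate(buf(buf_size))
  call MPI_BUFFER_ATTACH(buf, buf_size, ierr)

  ! One buffered send to self, just to exercise the attached buffer
  msg = (/ 1, 2, 3, 4 /)
  call MPI_BSEND(msg, nvals, MPI_INTEGER, rank, 0, MPI_COMM_WORLD, ierr)
  call MPI_RECV(got, nvals, MPI_INTEGER, rank, 0, MPI_COMM_WORLD, &
                MPI_STATUS_IGNORE, ierr)

  ! Detach blocks until all buffered messages have been transmitted
  call MPI_BUFFER_DETACH(buf, buf_size, ierr)
  deallocate(buf)
  call MPI_FINALIZE(ierr)
end program bsend_buffer_sketch
```

If more than nmsg buffered sends can be pending at the same time, the buffer in this sketch would be too small, so nmsg has to be a true upper bound for the application.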
First off, 1.4.4 is fairly ancient. You might want to try upgrading to 1.6.3.
Second, you might want to use non-blocking receives for B such that you can MPI_WAITALL, or perhaps MPI_WAITSOME or MPI_WAITANY to wait for some/all of the values to arrive in B. This keeps any looping down in MPI (i.e., as close to the hardware as possible).
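A rough sketch of that second suggestion follows. It is only an illustration under stated assumptions, not the poster's code: the ring of two neighbors, the array names A_nbr, A_send and B, and the single double-precision value per message are all hypothetical. Each rank posts one MPI_IRECV per neighbor up front, then uses MPI_WAITANY to set the matching B flag as each value arrives, so there is no user-level probe loop.

```fortran
program waitany_sketch
  use mpi
  implicit none
  integer, parameter :: nnbr = 2          ! hypothetical: two neighbors per rank
  integer :: ierr, rank, nprocs, n, idx
  integer :: nbr(nnbr), rreq(nnbr), sreq(nnbr)
  integer :: status(MPI_STATUS_SIZE)
  integer :: B(nnbr)                      ! 0 = not yet received, 1 = received
  double precision :: A_nbr(nnbr), A_send(nnbr)

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

  ! Hypothetical neighbor layout: left and right ranks on a ring
  nbr(1) = mod(rank - 1 + nprocs, nprocs)
  nbr(2) = mod(rank + 1, nprocs)
  A_send = dble(rank)
  B = 0

  ! Post all receives first, then the sends; nothing blocks here
  do n = 1, nnbr
     call MPI_IRECV(A_nbr(n), 1, MPI_DOUBLE_PRECISION, nbr(n), 0, &
                    MPI_COMM_WORLD, rreq(n), ierr)
  end do
  do n = 1, nnbr
     call MPI_ISEND(A_send(n), 1, MPI_DOUBLE_PRECISION, nbr(n), 0, &
                    MPI_COMM_WORLD, sreq(n), ierr)
  end do

  ! Handle neighbor values in arrival order; idx is 1-based in Fortran
  do n = 1, nnbr
     call MPI_WAITANY(nnbr, rreq, idx, status, ierr)
     B(idx) = 1        ! A_nbr(idx) is now safe to use
  end do

  ! Complete the sends before the send buffers go out of use
  call MPI_WAITALL(nnbr, sreq, MPI_STATUSES_IGNORE, ierr)
  call MPI_FINALIZE(ierr)
end program waitany_sketch
```

MPI_WAITALL on rreq would do the same job if nothing can be computed until every neighbor value is in. MPI_WAITANY is shown because it lets work on a cell start as soon as the value it needs has arrived.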
On Jan 25, 2013, at 3:21 PM, Jeremy McCaslin <firstname.lastname@example.org> wrote:
> I am trying to figure out the most appropriate MPI calls for a certain portion of my code. I will describe the situation here:
> Each cell (i,j) of my array A is being updated by a calculation that depends on the values of 1 or 2 of the 4 possible neighbors A(i+1,j), A(i-1,j), A(i,j+1), and A(i,j-1). Say, for example, A(i,j)=A(i-1,j)*A(i,j-1). The thing is, the values of the neighbors A(i-1,j) and A(i,j-1) cannot be used until an auxiliary array B has been updated from 0 to 1. The values B(i-1,j) and B(i,j-1) are changed from 0 -> 1 after the values A(i-1,j) and A(i,j-1) have been communicated to the proc that contains cell (i,j), as cells (i-1,j) and (i,j-1) belong to different procs. Here is pseudocode for how I have the algorithm implemented (in fortran):
> do while (B(ii,jj,kk).eq.0)
>    if (probe_for_message(i0,j0,k0,this_sc)) then
>    end if
> end do
> The function 'probe_for_message' uses an 'MPI_IPROBE' to see if 'MPI_ANY_SOURCE' has a message for my current proc. If there is a message, the function returns a true logical and calls 'MPI_RECV', receiving (i0,j0,k0,this_sc) from the proc that has the message. This works! My concern is that I am probing repeatedly inside the while loop until I receive a message from a proc such that ii=i0, jj=j0, kk=k0. I could potentially call MPI_IPROBE many many times before this happens... and I'm worried that this is a messy way of doing this. Could I "break" the mpi probe call? Are there MPI routines that would allow me to accomplish the same thing in a more formal or safer way? Maybe a persistent communication or something? For very large computations with many procs, I am observing a hanging situation which I suspect may be due to this. I observe it when using openmpi-1.4.4, and the hanging seems to disappear if I use mvapich. Any suggestions/comments would be greatly appreciated. Thanks so much!
> JM
> _______________________________________________
> devel mailing list