Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Collective operations and synchronization
From: Shaun Jackman (sjackman_at_[hidden])
Date: 2009-03-23 19:36:41

Hi Ralph,

Thanks for your response. My problem is removing all leaf nodes from a
directed graph that is distributed among a number of processes. Each
process iterates over its portion of the graph; if a node is a leaf
(indegree(n) == 0 || outdegree(n) == 0), the process removes the node
(a local operation) and notifies each of its neighbours (using
MPI_Ibsend) to remove any edges incident to the removed node. If a
neighbour becomes a leaf in turn, it too is removed, and the process
cascades. I use the following algorithm to check whether this cascade
is complete:

loop {
    MPI_Ibsend (for every edge of every leaf node)
    MPI_Iprobe/MPI_Recv (until no messages are pending)
    MPI_Allreduce (number of nodes removed)
} until (no nodes were removed by any process)
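For what it's worth, here is a minimal single-process sketch of the
cascading leaf removal described above. All names are hypothetical and
the MPI machinery is elided: where the real code would MPI_Ibsend a
notification to a neighbouring process, this sketch just updates the
neighbour's edge sets directly, and the "no nodes removed by any
process" test of the MPI_Allreduce becomes an empty-leaf-list check.

```python
def remove_leaves(out_edges):
    """Iteratively remove leaf nodes (indegree == 0 or outdegree == 0)
    until a pass removes nothing, mirroring the loop around
    MPI_Allreduce. Returns the total number of nodes removed;
    out_edges is modified in place."""
    # Build in-edge sets from the out-edge adjacency map.
    in_edges = {u: set() for u in out_edges}
    for u, vs in out_edges.items():
        for v in vs:
            in_edges[v].add(u)

    removed_total = 0
    while True:
        leaves = [u for u in out_edges
                  if not in_edges[u] or not out_edges[u]]
        if not leaves:  # analogue of the MPI_Allreduce seeing 0 removals
            break
        for u in leaves:
            # "Notify" each neighbour to drop edges incident to u;
            # in the distributed version this is an MPI_Ibsend.
            for v in out_edges[u]:
                if v in in_edges:
                    in_edges[v].discard(u)
            for v in in_edges[u]:
                if v in out_edges:
                    out_edges[v].discard(u)
            del out_edges[u]
            del in_edges[u]
        removed_total += len(leaves)
    return removed_total

# A path 1 -> 2 -> 3 cascades away entirely; a 2-cycle survives.
g = {1: {2}, 2: {3}, 3: set(), 4: {5}, 5: {4}}
print(remove_leaves(g))  # -> 3
print(sorted(g))         # -> [4, 5]
```

Note that the path 1 -> 2 -> 3 takes two passes to disappear (first 1
and 3, then 2), which is exactly the cascade the outer loop exists to
drive to completion.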

Previously, I attempted the following variant, in which the MPI_Ibsend
calls come after the receive loop rather than before it:

loop {
    MPI_Iprobe/MPI_Recv (until no messages are pending)
    MPI_Ibsend (for every edge of every leaf node)
    MPI_Allreduce (number of nodes removed)
} until (no nodes were removed by any process)

This latter algorithm did not terminate correctly. Now that I've
written the algorithm out in pseudo-code, it looks a little clearer:
there must be a race condition between the MPI_Iprobe and the MPI_Recv.
I wonder if using MPI_Irecv would clear it up.


Ralph Castain wrote:
> I think perhaps you folks are all caught up a tad too much in the
> standard and not reading the intent of someone's question... :-)
> I believe the original question was concerned with ensuring that all
> procs had completed MPI_Allreduce before his algorithm attempted other
> operations. As you folks know, procs can leave MPI_Allreduce at
> significantly different times. Using an MPI_Barrier after
> MPI_Allreduce would accomplish the questioner's objective.
> Whether or not the questioner's particular program really -needs- to
> do that is another matter - one I personally wouldn't attempt to
> answer without knowing a lot more about what that next step after
> MPI_Allreduce does.