It sounds like your sends and recvs aren't balanced somewhere, i.e., some process sends a message, but the intended recipient exits without posting a matching recv and waiting until the message has actually been received. If the recipient leaves before the isend completes, then the isend will never complete and the waitall will not return.
On Apr 4, 2014, at 5:20 PM, Ross Boylan <ross_at_[hidden]> wrote:
> During shutdown of my application the processes issue a waitall, since they have done some Isends. A couple of them never return from that call.
> Could this be the result of some of the processes already being shut down (the processes with the problem were late in the shutdown sequence)? If so, what is the recommended solution? A barrier?
> The shutdown proceeds in stages, but the processes in question are not told to shut down until all the messages they have sent have been received. So there shouldn't be any outstanding messages from them.
> My reading of the manual is that Waitall with a count of 0 should return immediately, not hang. Is that correct?
> Running under R with openmpi 1.7.4.