Shaun Jackman wrote:
> Eugene Loh wrote:
>> Shaun Jackman wrote:
>>> For my MPI application, each process reads a file and for each line
>>> sends a message (MPI_Send) to one of the other processes determined
>>> by the contents of that line. Each process posts a single MPI_Irecv
>>> and uses MPI_Request_get_status to test for a received message. If a
>>> message has been received, it processes the message and posts a new
>>> MPI_Irecv. I believe this situation is unsafe and prone to
>>> deadlock since MPI_Send may block. The receiver would need to post
>>> as many MPI_Irecv as messages it expects to receive, but it does not
>>> know in advance how many messages to expect from the other
>>> processes. How is this situation usually handled in an MPI
>>> application where the number of messages to receive is unknown?
>> Each process posts an MPI_Irecv to listen for in-coming messages.
>> Each process enters a loop in which it reads its file and sends out
>> messages. Within this loop, you also loop on MPI_Test to see if any
>> message has arrived. If so, process it, post another MPI_Irecv(),
>> and keep polling. (I'd use MPI_Test rather than
>> MPI_Request_get_status since you'll have to call something like
>> MPI_Test anyhow to complete the receive.)
>> Once you've posted all your sends, send out a special message to
>> indicate you're finished. I'm thinking of some sort of tree
>> fan-in/fan-out barrier so that everyone will know when everyone is
>> finished. Keep polling on MPI_Test, processing further receives or
>> advancing your fan-in/fan-out barrier.
>> So, the key ingredients are:
>> *) keep polling on MPI_Test and reposting MPI_Irecv calls to drain
>> in-coming messages while you're still in your "send" phase
>> *) have another mechanism for processes to notify one another when
>> they've finished their send phases
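>> Roughly, in code (an untested sketch; MSG_LEN, read_line,
>> destination_of, and process_message are made-up stand-ins for
>> whatever your application actually does, and I'm assuming
>> fixed-length messages on tag 0):
>>
>>   char buf[MSG_LEN], line[MSG_LEN];
>>   MPI_Request rreq;
>>   int flag;
>>
>>   /* Post the listening receive before any sends go out. */
>>   MPI_Irecv(buf, MSG_LEN, MPI_CHAR, MPI_ANY_SOURCE, 0,
>>             MPI_COMM_WORLD, &rreq);
>>
>>   while (read_line(line)) {
>>       MPI_Send(line, MSG_LEN, MPI_CHAR, destination_of(line), 0,
>>                MPI_COMM_WORLD);
>>
>>       /* Drain: complete and repost any receive that has already
>>        * arrived, then keep polling until nothing is pending. */
>>       MPI_Test(&rreq, &flag, MPI_STATUS_IGNORE);
>>       while (flag) {
>>           process_message(buf);
>>           MPI_Irecv(buf, MSG_LEN, MPI_CHAR, MPI_ANY_SOURCE, 0,
>>                     MPI_COMM_WORLD, &rreq);
>>           MPI_Test(&rreq, &flag, MPI_STATUS_IGNORE);
>>       }
>>   }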
> Hi Eugene,
> Very astute. You've pretty much exactly described how it works now,
> particularly the loop around MPI_Test and MPI_Irecv to drain incoming
> messages. So, here's my worry, which I'll demonstrate with an example.
> We have four processes. Each calls MPI_Irecv once. Each reads one line
> of its file. Each sends one message with MPI_Send to some other
> process based on the line that it has read, and then goes into the
> MPI_Test/MPI_Irecv loop.
> The events fall out in this order:
> 2 sends to 0 and does not block (0 has one MPI_Irecv posted)
> 3 sends to 1 and does not block (1 has one MPI_Irecv posted)
> 0 receives the message from 2, consuming its MPI_Irecv
> 1 receives the message from 3, consuming its MPI_Irecv
> 0 sends to 1 and blocks (1 has no more MPI_Irecv posted)
> 1 sends to 0 and blocks (0 has no more MPI_Irecv posted)
> and now processes 0 and 1 are deadlocked.
> When I say `receives' above, I mean that Open MPI has received the
> message and copied it into the buffer passed to the MPI_Irecv call,
> but the application hasn't yet called MPI_Test. The next step would be
> for all the processes to call MPI_Test, but 0 and 1 are already
> blocked in MPI_Send.
I don't get it. Processes should drain aggressively. So, if 0 receives
a message, it should immediately post the next MPI_Irecv. Before 0
posts a send, it should MPI_Test (and post the next MPI_Irecv if the
test received a message).
Further, you could convert to MPI_Isend.
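Something like this, reusing the names from the sketch above (again
untested; the point is just that the send and the receive get tested
in the same polling loop):

  MPI_Request sreq;
  int sent = 0, got;

  MPI_Isend(line, MSG_LEN, MPI_CHAR, destination_of(line), 0,
            MPI_COMM_WORLD, &sreq);

  /* Keep draining incoming messages until our own send completes,
   * so two processes can't wedge waiting on each other. Don't
   * reuse `line' until sent is true. */
  while (!sent) {
      MPI_Test(&sreq, &sent, MPI_STATUS_IGNORE);
      MPI_Test(&rreq, &got, MPI_STATUS_IGNORE);
      if (got) {
          process_message(buf);
          MPI_Irecv(buf, MSG_LEN, MPI_CHAR, MPI_ANY_SOURCE, 0,
                    MPI_COMM_WORLD, &rreq);
      }
  }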
But maybe I'm missing something.