
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Receiving an unknown number of messages
From: Shaun Jackman (sjackman_at_[hidden])
Date: 2009-07-23 16:51:27

Eugene Loh wrote:
> Shaun Jackman wrote:
>> For my MPI application, each process reads a file and for each line
>> sends a message (MPI_Send) to one of the other processes determined by
>> the contents of that line. Each process posts a single MPI_Irecv and
>> uses MPI_Request_get_status to test for a received message. If a
>> message has been received, it processes the message and posts a new
>> MPI_Irecv. I believe this situation is not safe and prone to deadlock
>> since MPI_Send may block. The receiver would need to post as many
>> MPI_Irecv as messages it expects to receive, but it does not know in
>> advance how many messages to expect from the other processes. How is
>> this situation usually handled in an MPI application where the number
>> of messages to receive is unknown?
> Each process posts an MPI_Irecv to listen for in-coming messages.
> Each process enters a loop in which it reads its file and sends out
> messages. Within this loop, you also loop on MPI_Test to see if any
> message has arrived. If so, process it, post another MPI_Irecv(), and
> keep polling. (I'd use MPI_Test rather than MPI_Request_get_status
> since you'll have to call something like MPI_Test anyhow to complete the
> receive.)
> Once you've posted all your sends, send out a special message to
> indicate you're finished. I'm thinking of some sort of tree
> fan-in/fan-out barrier so that everyone will know when everyone is finished.
> Keep polling on MPI_Test, processing further receives or advancing your
> fan-in/fan-out barrier.
> So, the key ingredients are:
> *) keep polling on MPI_Test and reposting MPI_Irecv calls to drain
> in-coming messages while you're still in your "send" phase
> *) have another mechanism for processes to notify one another when
> they've finished their send phases

Hi Eugene,

Very astute. You've pretty much exactly described how it works now,
particularly the loop around MPI_Test and MPI_Irecv to drain incoming
messages. So, here's my worry, which I'll demonstrate with an example.
We have four processes. Each calls MPI_Irecv once. Each reads one line
of its file. Each sends one message with MPI_Send to some other
process based on the line that it has read, and then goes into the
MPI_Test/MPI_Irecv loop.

The events fall out in this order:
2 sends to 0 and does not block (0 has one MPI_Irecv posted)
3 sends to 1 and does not block (1 has one MPI_Irecv posted)
0 receives the message from 2, consuming its MPI_Irecv
1 receives the message from 3, consuming its MPI_Irecv
0 sends to 1 and blocks (1 has no more MPI_Irecv posted)
1 sends to 0 and blocks (0 has no more MPI_Irecv posted)
and now processes 0 and 1 are deadlocked.
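
To make the sequence above concrete, here is a toy model of it in plain C. It is not real MPI and makes no MPI calls. It assumes the worst case that I am worried about: a send completes only if the destination currently has a receive posted, and otherwise the sender blocks. A matched send consumes the posted receive immediately, so the separate "receives" steps above are folded into the sends. The names Proc, init_procs, model_send, mutually_blocked and replay_example are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NPROCS 4
#define NOBODY (-1)

/* Toy state for one rank. This is a model, not MPI. */
typedef struct {
    int posted_recvs; /* receives posted and not yet matched */
    int waiting_on;   /* rank this process is blocked sending to, or NOBODY */
} Proc;

/* Every rank starts with exactly one receive posted, as in the example. */
void init_procs(Proc p[NPROCS])
{
    for (int i = 0; i < NPROCS; i++) {
        p[i].posted_recvs = 1;
        p[i].waiting_on = NOBODY;
    }
}

/* Model a blocking send from src to dst under the worst-case assumption
 * that the send completes only if dst has a receive posted.
 * Returns true if the send completes, false if src blocks. */
bool model_send(Proc p[NPROCS], int src, int dst)
{
    assert(src >= 0 && src < NPROCS);
    assert(dst >= 0 && dst < NPROCS);
    assert(p[src].waiting_on == NOBODY); /* a blocked rank cannot send again */

    if (p[dst].posted_recvs > 0) {
        p[dst].posted_recvs--; /* the message consumes dst's posted receive */
        printf("%d sends to %d and does not block\n", src, dst);
        return true;
    }
    p[src].waiting_on = dst; /* src is stuck and never reaches its test loop */
    printf("%d sends to %d and blocks\n", src, dst);
    return false;
}

/* Two ranks are deadlocked when each is blocked sending to the other. */
bool mutually_blocked(const Proc p[NPROCS], int a, int b)
{
    return p[a].waiting_on == b && p[b].waiting_on == a;
}

/* Replay the four sends from the example; return how many ranks are blocked. */
int replay_example(void)
{
    Proc p[NPROCS];
    int blocked = 0;

    init_procs(p);
    model_send(p, 2, 0);
    model_send(p, 3, 1);
    model_send(p, 0, 1);
    model_send(p, 1, 0);

    for (int i = 0; i < NPROCS; i++) {
        if (p[i].waiting_on != NOBODY)
            blocked++;
    }
    if (mutually_blocked(p, 0, 1))
        printf("processes 0 and 1 are deadlocked\n");
    return blocked;
}
```

Calling replay_example() from a main() leaves ranks 0 and 1 each blocked on the other. Neither can get back to its test loop to repost a receive, so neither send can ever complete in this model.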

When I say `receives' above, I mean that Open MPI has received the
message and copied it into the buffer passed to the MPI_Irecv call,
but the application hasn't yet called MPI_Test. The next step would be
for all the processes to call MPI_Test, but 0 and 1 are already