Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Receiving MPI messages of unknown size
From: Gus Correa (gus_at_[hidden])
Date: 2009-06-03 21:55:59


Hi Lars

I wonder if you could always use blocking message passing on the
preliminary send/receive pair that transmits the message size/header,
then use non-blocking mode for the actual message.
If the "message size/header" part is a small buffer (below the eager
limit), the preliminary send/recv pair will use the "eager" communication
mode, return quickly, and may not reduce performance noticeably, I would guess.

For a group of several messages the preliminary
send/recv pair could transmit a small (to ensure "eager mode")
array of message sizes,
maybe along with the message tags and sender ranks,
instead of only one size.

Just a thought.

Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

Lars Andersson wrote:
> Hi,
>
> I'm trying to solve a problem of passing serializable, arbitrarily
> sized objects around using MPI and non-blocking communication. The
> problem I'm facing is what to do at the receiving end when expecting
> an object of unknown size, but at the same time not block on waiting
> for it.
>
> When using blocking message passing, I have simply solved the problem
> by first sending a small, fixed-size header containing the size of
> the rest of the data, sent in the following MPI message. When using
> non-blocking message passing, this doesn't seem to be such a good
> idea, since we can't post the main data transfer until we have received
> the message header... It seems to take away most of the advantages of
> non-blocking I/O in the first place.
>
>
> I've been thinking about solving this using MPI_Probe / MPI_Iprobe,
> but I'm worried about performance.
>
>
> Question 1:
>
> Will MPI_Probe or the underlying MPI implementation actually receive
> the full message data (assuming a reasonably sized message, say less
> than 10 MB) before MPI_Probe returns? Or will there be a significant
> data transfer delay (for large messages) when calling MPI_Recv after a
> successful MPI_Probe?
>
>
>
> What I want is something like this:
>
> 1) post one or several non-blocking, variable-sized message receives
>
> 2) do other, non-MPI work, while any incoming messages will be fully
> received into buffers on the local machine.
>
> 3) perform completion of the receives posted in 1). I don't want to
> unnecessarily wait here for data transfers that could have taken
> place during 2).
>
>
> Problems:
>
> I can't post non-blocking MPI_Irecv() calls in 1, because I don't know
> the sizes of incoming messages.
>
> If I simply do nothing in 1, and call MPI_Probe in 3, I'm worried that
> I won't get nice compute/transfer overlap because the messages won't
> actually be received locally until I post a Probe or Recv in 3.
>
>
> Question 2:
>
> How can I achieve the communication sequence described in 1,2,3 above,
> with overlapping data transfer and local computation during 2?
>
>
> Question 3:
>
> A temporary kludge solution to the problem above might be to allocate
> a temporary receive buffer of some arbitrary, constant maximum size
> BUFSIZE in 1 for each non-blocking receive operation, make sure
> messages sent are not larger than BUFSIZE, and post MPI_Irecv(buffer,
> BUFSIZE,...) calls in 1. I haven't been able to figure out if it's
> actually correct and portable to receive less data than specified in
> the count argument to MPI_Irecv.
>
> What if the message sent on the other end is 10 bytes, and
> BUFSIZE=count=20? Would that be OK?
>
>
> If anyone can shed any light on this, I'd be grateful. FYI, we're
> using a cluster of 2-8 core x86-64 machines running Linux and
> connected using ordinary 1 Gbit Ethernet.
>
>
> Best regards,
>
> Lars Andersson