
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] efficient strategy with temporary message copy
From: christophe petit (christophe.petit09_at_[hidden])
Date: 2014-03-17 14:18:23


Thanks Jeff, I now understand the different cases better and how to choose
depending on the situation.

2014-03-17 16:31 GMT+01:00 Jeff Squyres (jsquyres) <jsquyres_at_[hidden]>:

> On Mar 16, 2014, at 10:24 PM, christophe petit <
> christophe.petit09_at_[hidden]> wrote:
>
> > I am studying the optimization strategy when the number of communication
> functions in a code is high.
> >
> > My courses on MPI say two things about optimization which are
> contradictory:
> >
> > 1*) You have to use a temporary message copy to allow non-blocking
> sending and to decouple the sending and the receiving
>
> There are a lot of schools of thought here, and the real answer is going
> to depend on your application.
>
> If the message is "short" (and the exact definition of "short" depends on
> your platform -- it varies depending on your CPU, your memory, your
> CPU/memory interconnect, ...etc.), then copying to a pre-allocated bounce
> buffer is typically a good idea. That lets you keep using your "real"
> buffer and not have to wait until communication is done.
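>
> As a rough sketch of that bounce-buffer idea (SHORT_MSG_MAX and the
> static "bounce" buffer below are illustrative assumptions, not any
> particular MPI API):
>
>     #include <mpi.h>
>     #include <string.h>
>
>     #define SHORT_MSG_MAX 1024           /* hypothetical "short" cutoff */
>     static char bounce[SHORT_MSG_MAX];   /* pre-allocated once */
>
>     /* Copy a short message out of the caller's buffer and send it
>        non-blocking; the caller may reuse buf immediately. With a
>        single bounce buffer, only one send can be outstanding. */
>     void send_short(const void *buf, int nbytes, int dest, int tag,
>                     MPI_Comm comm, MPI_Request *req)
>     {
>         memcpy(bounce, buf, nbytes);
>         MPI_Isend(bounce, nbytes, MPI_BYTE, dest, tag, comm, req);
>     }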
>
> For "long" messages, the equation is a bit different. If "long" isn't
> "enormous", you might be able to have N buffers available, and simply work
> on 1 of them at a time in your main application and use the others for
> ongoing non-blocking communication. This is sometimes called "shadow"
> copies, or "ghost" copies.
>
> Such shadow copies are most useful when you receive something each
> iteration. For example, something like this:
>
> buffer[0] = malloc(...);
> buffer[1] = malloc(...);
> current = 0;
> while (still_doing_iterations) {
>     MPI_Irecv(buffer[current], ..., &req);
>     /* work on buffer[1 - current], filled in the previous iteration */
>     MPI_Wait(&req, MPI_STATUS_IGNORE);
>     current = 1 - current;
> }
>
> You get the idea.
>
> > 2*) Avoid using a temporary message copy because the copy adds extra
> cost to the execution time.
>
> It will, if the memcpy cost is significant (especially compared to the
> network time to send it). If the memcpy is small/insignificant, then don't
> worry about it.
>
> You'll need to determine where this crossover point is, however.
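>
> One rough way to gauge the memcpy side of that crossover (a sketch;
> you'd compare the result against a measured send time for the same
> message size):
>
>     #include <mpi.h>
>     #include <string.h>
>
>     /* Average seconds per memcpy of len bytes, over iters repetitions. */
>     double time_memcpy(void *dst, const void *src, size_t len, int iters)
>     {
>         double t0 = MPI_Wtime();
>         for (int i = 0; i < iters; i++)
>             memcpy(dst, src, len);
>         return (MPI_Wtime() - t0) / iters;
>     }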
>
> Also keep in mind that MPI and/or the underlying network stack will likely
> be doing these kinds of things under the covers for you. Indeed, if you
> send short messages -- even via MPI_SEND -- it may return "immediately",
> indicating that MPI says it's safe for you to use the send buffer. But
> that doesn't mean that the message has actually left the current
> server and gone out onto the network yet (i.e., some other layer below you
> may have just done a memcpy because it was a short message, and the
> processing/sending of that message is still ongoing).
>
> > And then, we are advised to do:
> >
> > - replace MPI_SEND with MPI_SSEND (synchronous blocking send): it is
> said that execution time is divided by a factor of 2
>
> This very, very much depends on your application.
>
> MPI_SSEND won't return until the receiver has started to receive the
> message.
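>
> At the call site it's a drop-in replacement -- same arguments as
> MPI_SEND, only the completion semantics differ (a sketch, with
> illustrative argument names):
>
>     /* Blocks until the matching receive has started; MPI_Send may
>        return as soon as the send buffer is reusable. */
>     MPI_Ssend(buf, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);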
>
> For some communication patterns, putting in this additional level of
> synchronization is helpful -- it keeps all MPI processes in tighter
> synchronization and you might experience less jitter, etc., and therefore
> overall execution time can be shorter.
>
> But for others, it adds unnecessary delay.
>
> I'd say it's an over-generalization that simply replacing MPI_SEND with
> MPI_SSEND always reduces execution time by a factor of 2.
>
> > - use MPI_ISSEND and MPI_IRECV with the MPI_WAIT function to synchronize
> (synchronous non-blocking send): it is said that execution time is divided
> by a factor of 3
>
> Again, it depends on the app. Generally, non-blocking communication is
> better -- *if your app can effectively overlap communication and
> computation*.
>
> If your app doesn't take advantage of this overlap, then you won't see
> such performance benefits. For example:
>
> MPI_Isend(buffer, ..., &req);
> MPI_Wait(&req, ...);
>
> Technically, the above uses ISEND and WAIT... but it's actually probably
> going to be *slower* than using MPI_SEND because you've made multiple
> function calls with no additional work between the two -- so the app didn't
> effectively overlap the communication with any local computation. Hence:
> no performance benefit.
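>
> For contrast, a sketch of a version that does overlap (do_local_work()
> is a stand-in for whatever computation your app has that doesn't touch
> buffer):
>
>     MPI_Request req;
>     MPI_Isend(buffer, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD, &req);
>     do_local_work();                   /* useful work during the send */
>     MPI_Wait(&req, MPI_STATUS_IGNORE);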
>
> > So what's the best optimization? Do we have to use a temporary message
> copy or not, and if so, in which cases?
>
> As you can probably see from my text above, the answer is: it depends. :-)
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>