Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How to reduce Isend & Irecv bandwidth?
From: Gus Correa (gus_at_[hidden])
Date: 2013-05-01 18:05:04


Hi Thomas/Jacky

Maybe use MPI_Probe (and maybe also MPI_Cancel)
to probe each incoming message's size
(with MPI_Get_count on the status),
and receive only those with size > 0?
Anyway, I'm just code-guessing.
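
Something along these lines, maybe (a rough, untested sketch; the
names -- recv_blocks, buf, nblocks, blocklen, and the per-block tags --
are made up, and it assumes the sender still posts a send for every
block, possibly with count 0, otherwise MPI_Probe would block forever):

#include <mpi.h>

/* Probe each expected block message, check its size, and only do a
   full-size receive when there is real payload. */
void recv_blocks(int src, int nblocks, int blocklen, double *buf[])
{
    for (int m = 0; m < nblocks; m++) {
        MPI_Status st;
        int count;

        MPI_Probe(src, m /* tag */, MPI_COMM_WORLD, &st);
        MPI_Get_count(&st, MPI_DOUBLE, &count);

        if (count > 0)
            MPI_Recv(buf[m], blocklen, MPI_DOUBLE, src, m,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        else   /* consume the empty message so it does not stay queued */
            MPI_Recv(NULL, 0, MPI_DOUBLE, src, m,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
}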

I hope it helps,
Gus Correa

On 05/01/2013 05:14 PM, Thomas Watson wrote:
> Hi Gus,
>
> Thanks for your suggestion!
>
> The problem with this two-phase data exchange is as follows. Each rank
> can have data blocks that will be exchanged with potentially all other
> ranks. So if a rank needs to tell all the other ranks which blocks to
> receive, it would require an all-to-all collective communication
> during phase one (e.g., MPI_Allgatherv). Because such collective
> communication is blocking in the current stable Open MPI (MPI-2), it
> would hurt the scalability of the application, especially when we have
> a large number of MPI ranks. That negative impact would not be
> compensated by the bandwidth saved :-)
>
> What I really need is something like this: Isend sets count to 0 if a
> block is not dirty. On the receiving side, MPI_Waitall deallocates the
> corresponding Irecv request immediately and sets the Irecv request
> handle to MPI_REQUEST_NULL, as if it were a normal Irecv. I am wondering
> if someone could confirm this behavior for me? I could also run an
> experiment on this...
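
As far as I know, that is what should happen: a count-0 Isend matches
the posted Irecv like any other message, the Irecv completes with an
empty payload, and MPI_Waitall sets its handle to MPI_REQUEST_NULL
(worth verifying with a small test). A rough, untested sketch of that
scheme, with made-up names (peer, nblocks, blocklen, sendblk, recvblk,
dirty):

#include <mpi.h>
#include <stdlib.h>

void exchange(int peer, int nblocks, int blocklen,
              double *sendblk[], double *recvblk[], const int dirty[])
{
    MPI_Request *req = malloc(2 * nblocks * sizeof *req);

    for (int m = 0; m < nblocks; m++) {
        /* Clean blocks are still "sent", but with count 0. */
        int scount = dirty[m] ? blocklen : 0;

        MPI_Isend(sendblk[m], scount, MPI_DOUBLE, peer, m,
                  MPI_COMM_WORLD, &req[2 * m]);
        /* Post the Irecv with the full block length; if the sender's
           count was 0 it still completes, with nothing written. */
        MPI_Irecv(recvblk[m], blocklen, MPI_DOUBLE, peer, m,
                  MPI_COMM_WORLD, &req[2 * m + 1]);
    }

    /* Waitall frees every completed request and sets it to
       MPI_REQUEST_NULL.  To learn which blocks actually carried data,
       pass a status array and call MPI_Get_count on each recv status. */
    MPI_Waitall(2 * nblocks, req, MPI_STATUSES_IGNORE);
    free(req);
}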
>
> Regards,
>
> Jacky
>
>
>
>
> On Wed, May 1, 2013 at 3:46 PM, Gus Correa <gus_at_[hidden]> wrote:
>
> Maybe start the data exchange by sending a (presumably short)
> list/array of each block's dirty/not-dirty status
> (say, 0 = not dirty, 1 = dirty),
> and then put if-conditionals before the Isend/Irecv so that only
> dirty blocks are exchanged?
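
Roughly what I had in mind, as an untested sketch (peer, nblocks,
blocklen, sendblk, recvblk, and my_dirty are made-up names); the flag
arrays could also be swapped pairwise instead of collectively:

#include <mpi.h>
#include <stdlib.h>

void exchange_dirty_only(int peer, int nblocks, int blocklen,
                         double *sendblk[], double *recvblk[],
                         char my_dirty[])
{
    char *peer_dirty = malloc(nblocks);
    MPI_Request *req = malloc(2 * nblocks * sizeof *req);
    int nreq = 0;

    /* Phase 1: swap the (short) dirty/not-dirty flag arrays. */
    MPI_Sendrecv(my_dirty, nblocks, MPI_CHAR, peer, 0,
                 peer_dirty, nblocks, MPI_CHAR, peer, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* Phase 2: post Isend/Irecv only where there is data to move.
       Block tags start at 1 so they do not collide with the flag tag. */
    for (int m = 0; m < nblocks; m++) {
        if (my_dirty[m])
            MPI_Isend(sendblk[m], blocklen, MPI_DOUBLE, peer, m + 1,
                      MPI_COMM_WORLD, &req[nreq++]);
        if (peer_dirty[m])
            MPI_Irecv(recvblk[m], blocklen, MPI_DOUBLE, peer, m + 1,
                      MPI_COMM_WORLD, &req[nreq++]);
    }

    MPI_Waitall(nreq, req, MPI_STATUSES_IGNORE);
    free(peer_dirty);
    free(req);
}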
>
> I hope this helps,
> Gus Correa
>
>
>
>
> On 05/01/2013 01:28 PM, Thomas Watson wrote:
>
> Hi,
>
> I have a program where each MPI rank hosts a set of data blocks.
> After doing computation over *some of* its local data blocks, each
> MPI rank needs to exchange data with other ranks. Note that the
> computation may involve only a subset of the data blocks on an MPI
> rank. The data exchange is achieved at each MPI rank through Isend
> and Irecv and then Waitall to complete the requests. Each pair of
> Isend and Irecv exchanges a corresponding pair of data blocks at
> different ranks. Right now, we do Isend/Irecv for EVERY block!
>
> The idea is that because the computation at a rank may involve only a
> subset of blocks, we could mark those blocks as dirty during the
> computation. And to reduce data exchange bandwidth, we could exchange
> only those *dirty* pairs across ranks.
>
> The problem is: if a rank does not compute on a block 'm', and if it
> does not call Isend for 'm', then the receiving rank must somehow know
> this and either a) not call Irecv for 'm' either, or b) let the Irecv
> for 'm' fail gracefully.
>
> My questions are:
> 1. How will Irecv behave (actually, how will MPI_Waitall behave) if
> the corresponding Isend is missing?
>
> 2. If we still post an Isend for 'm', but because we really do not
> need to send any data for 'm', can I just set a "flag" in the Isend so
> that MPI_Waitall on the receiving side will "cancel" the corresponding
> Irecv immediately? For example, I could set the count in the Isend to
> 0, and on the receiving side, when MPI_Waitall sees a message with an
> empty payload, it reclaims the corresponding Irecv? In my code, the
> correspondence between a pair of Isend and Irecv is established by a
> matching TAG.
>
> Thanks!
>
> Jacky
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users