The all2all scheduling works only because we know every process will send
the same amount of data, so the communications will take nearly the
same time. Therefore, we can predict how to schedule the
communications to get the best out of the network. But this approach
can lead to worse performance for all2allv in most cases. From
a user perspective, we can imagine that if the processes send roughly the
same size, they can get some benefit from this approach. But from inside
the MPI library, how can we figure out the amounts that have to be sent
globally? Each process knows only how much data it has
to send and how much data it has to receive, but unfortunately it
does not have any information about the communications that will
take place between the other processes.
Of course, we can do an all2all with the sizes before deciding
how to do the all2allv, but the cost can be prohibitive in
most cases.
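To make the point concrete, here is a small simulation (plain Python, no MPI calls) of what a single process can and cannot see. Everything in it is an assumption for illustration: the count matrices, the function names, and the "max at most twice min" rule for "roughly the same size" are hypothetical and are not what the library does. The size exchange is modelled as every process ending up with every other process's send counts.

```python
# Simulation only: no MPI calls are made. counts[i][j] is the number of
# bytes rank i would send to rank j in a hypothetical all2allv.

def local_view(counts, rank):
    """What one rank knows up front: its own send row and receive column."""
    send = list(counts[rank])
    recv = [row[rank] for row in counts]
    return send, recv

def exchange_sizes(counts):
    """Stand-in for exchanging all sizes before the real communication.

    Returns the full matrix each rank would then hold, plus the number of
    integers each rank must store (nprocs * nprocs), which is the extra
    cost paid before any payload moves.
    """
    nprocs = len(counts)
    gathered = [list(row) for row in counts]
    return gathered, nprocs * nprocs

def roughly_uniform(counts, tolerance=2.0):
    """Hypothetical rule: largest off-diagonal size <= tolerance * smallest."""
    sizes = [c for i, row in enumerate(counts)
             for j, c in enumerate(row) if i != j]
    if not sizes:
        return True
    smallest, largest = min(sizes), max(sizes)
    if smallest == 0:
        return largest == 0
    return largest <= tolerance * smallest

# Two made-up patterns on 3 ranks. Rank 0 sends and receives exactly the
# same amounts in both, so it cannot tell them apart on its own.
balanced = [[0, 4, 4], [4, 0, 4], [4, 4, 0]]
skewed = [[0, 4, 4], [4, 0, 400], [4, 400, 0]]

if __name__ == "__main__":
    print(local_view(balanced, 0) == local_view(skewed, 0))
    for name, pattern in (("balanced", balanced), ("skewed", skewed)):
        full, ints_per_rank = exchange_sizes(pattern)
        print(name, roughly_uniform(full), ints_per_rank)
```

Rank 0 has the same local view in both patterns, yet only the first one would suit the all2all schedule. Telling them apart needs the full matrix, and that grows with the square of the number of processes.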
Anyway, we're working on this issue and hopefully we will have a
On Feb 7, 2006, at 11:45 AM, Konstantin Kudin wrote:
> Hi all,
> I was wondering if it would be possible to use the same scheduling for
> "alltoallv" as for "alltoall". If one assumes the messages of roughly
> the same size, then "alltoall" would not be an unreasonable
> approximation for "alltoallv". As is, it appears that in v1.1
> "alltoallv" is done via a bunch of "isend+irecv", while "alltoall"
> is a bit more clever.
> One could then also have a runtime flag to use this sort of
"Half of what I say is meaningless; but I say it so that the other
half may reach you"