amjad ali wrote:
Dear E. Loh.
Another is whether you can
overlap communications and computation.
This does not require persistent channels, but only nonblocking
communications (MPI_Isend/MPI_Irecv). Again, there are no MPI
guarantees here, so you may have to break your computation up and
insert MPI_Test calls.
You may want to get the basic functionality working first and then run
performance experiments to decide whether these really are areas that
warrant such optimizations.
CALL MPI_STARTALL
-------perform work that could
be done with local data ---------------- (A)
CALL MPI_WAITALL( )
-------perform work
using the received data --------------- (B)
In the above I have broken up the computation. In (A) I perform the
work that can be done with local data. When the received data is needed
for the remaining computations, I call WAITALL to ensure that the data
from the neighbouring processes has been received. I am fine with
MPI_IRECV and MPI_ISEND, i.e.,
CALL MPI_IRECV()
CALL MPI_ISEND()
-------perform work that could
be done with local data ---------------- (A)
CALL MPI_WAITALL( )
-------perform work
using the received data --------------- (B)
But I doubt whether I am actually getting computation-communication
overlap that saves time, or whether I am getting the same performance
as could be obtained by,
CALL MPI_IRECV()
CALL MPI_ISEND()
CALL MPI_WAITALL( )
-------perform work that could
be done with local data ---------------- (A)
-------perform work
using the received data --------------- (B)
In this case (equivalent to blocking communication), I observed that
it takes only around 5% more time.
Right. Again, MPI makes no guarantees that communications are actually
progressing between when you have posted nonblocking operations (like
Isend or Irecv) and when you force them to complete with MPI_Wait
calls. Sometimes (depending on the MPI implementation and what
interconnect is being used to effect a particular message), you have to
decompose the computation more finely. E.g., your situation might be:
CALL MPI_ISEND()
call my_work() ! no MPI progress is being made here
CALL MPI_WAIT()
and it's conceivable that you might have better performance with
CALL MPI_ISEND()
DO I = 1, N
   call do_a_little_of_my_work()  ! no MPI progress is being made here
   CALL MPI_TEST()                ! enough MPI progress is being made here
                                  ! that the receiver has something to do
END DO
CALL MPI_WAIT()
Whether performance improves or not is not guaranteed by the MPI
standard.
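For what it's worth, here is a minimal sketch of that chunked pattern as a complete program. Everything specific in it is an assumption made for illustration: the ring exchange with left/right neighbours, the sizes n and nchunks, the work array, and the body of do_a_little_of_my_work (which just sums a slice). It uses MPI_TESTALL rather than MPI_TEST because there are two requests. Whether it runs faster than a single MPI_WAITALL after all of (A) depends on the MPI implementation and interconnect, so time both variants.

```fortran
program overlap_sketch
  use mpi
  implicit none
  ! Hypothetical sizes, chosen only for illustration.
  integer, parameter :: n = 100000, nchunks = 20
  double precision, allocatable :: sendbuf(:), recvbuf(:), work(:)
  double precision :: local_sum, halo_sum, t0, t1
  integer :: reqs(2), ierr, rank, nprocs, left, right, i
  logical :: done

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  left  = mod(rank - 1 + nprocs, nprocs)   ! assumed ring topology
  right = mod(rank + 1, nprocs)

  allocate(sendbuf(n), recvbuf(n), work(n))
  sendbuf   = dble(rank)
  work      = 1.0d0      ! local data, not touched by any MPI call
  local_sum = 0.0d0
  done      = .false.

  t0 = MPI_WTIME()
  call MPI_IRECV(recvbuf, n, MPI_DOUBLE_PRECISION, left, 0, &
                 MPI_COMM_WORLD, reqs(1), ierr)
  call MPI_ISEND(sendbuf, n, MPI_DOUBLE_PRECISION, right, 0, &
                 MPI_COMM_WORLD, reqs(2), ierr)

  ! (A) local work, split into chunks so the MPI library gets
  ! regular opportunities to make progress on the pending requests.
  do i = 1, nchunks
     call do_a_little_of_my_work(i)
     call MPI_TESTALL(2, reqs, done, MPI_STATUSES_IGNORE, ierr)
  end do

  ! Completion is still enforced here; if MPI_TESTALL already
  ! completed both requests, this returns immediately.
  call MPI_WAITALL(2, reqs, MPI_STATUSES_IGNORE, ierr)

  ! (B) work that needs the received data.
  halo_sum = sum(recvbuf)
  t1 = MPI_WTIME()

  if (rank == 0) print *, 'elapsed (s):', t1 - t0, &
                          ' sums:', local_sum, halo_sum

  deallocate(sendbuf, recvbuf, work)
  call MPI_FINALIZE(ierr)

contains

  ! Hypothetical stand-in for one slice of the real computation.
  subroutine do_a_little_of_my_work(chunk)
    integer, intent(in) :: chunk
    integer :: lo, hi
    lo = (chunk - 1) * (n / nchunks) + 1
    hi = chunk * (n / nchunks)
    local_sum = local_sum + sum(work(lo:hi))
  end subroutine do_a_little_of_my_work

end program overlap_sketch
```

The number of chunks is a tuning knob: too few and the library rarely gets a chance to progress the message, too many and the MPI_TESTALL overhead starts to show.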
And the SECOND desire is to use Persistent communication
for even better speedup.
Right. That's a separate issue.
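For reference, the persistent-request form of the same exchange might look like the sketch below. The ring neighbours, buffer size n, and step count nsteps are assumptions for illustration, and the (A)/(B) work is left as comments. The requests are set up once with MPI_RECV_INIT/MPI_SEND_INIT and restarted each iteration with MPI_STARTALL. Any speedup over MPI_IRECV/MPI_ISEND is implementation dependent and not guaranteed.

```fortran
program persistent_sketch
  use mpi
  implicit none
  ! Hypothetical sizes, chosen only for illustration.
  integer, parameter :: n = 1000, nsteps = 10
  double precision :: sendbuf(n), recvbuf(n)
  integer :: reqs(2), ierr, rank, nprocs, left, right, step

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  left  = mod(rank - 1 + nprocs, nprocs)   ! assumed ring topology
  right = mod(rank + 1, nprocs)

  ! Describe the communication once; nothing is sent yet.
  call MPI_RECV_INIT(recvbuf, n, MPI_DOUBLE_PRECISION, left, 0, &
                     MPI_COMM_WORLD, reqs(1), ierr)
  call MPI_SEND_INIT(sendbuf, n, MPI_DOUBLE_PRECISION, right, 0, &
                     MPI_COMM_WORLD, reqs(2), ierr)

  do step = 1, nsteps
     sendbuf = dble(rank + step)           ! fill before starting
     call MPI_STARTALL(2, reqs, ierr)
     ! (A) work that can be done with local data goes here
     call MPI_WAITALL(2, reqs, MPI_STATUSES_IGNORE, ierr)
     ! (B) work that uses recvbuf goes here
  end do

  ! Persistent requests stay allocated until explicitly freed.
  call MPI_REQUEST_FREE(reqs(1), ierr)
  call MPI_REQUEST_FREE(reqs(2), ierr)
  call MPI_FINALIZE(ierr)
end program persistent_sketch
```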