David Zhang wrote:
Blocking send/recv, as the name suggests, stop processing
your master and slave code until the data is received on the slave side.
Just to clarify...
If you use point-to-point send and receive calls, you can make the
block/nonblock decision independently on the send and receive sides.
E.g., use blocking send and nonblocking receive. Or nonblocking send
and blocking receive. You get the idea.
Blocking on the send side only means blocking until the message has
left the user's buffer on the send side. It does not guarantee that
the data has been received on the other end.
I agree with Bill that performance portability is an issue. That is,
the MPI standard itself doesn't really provide any guarantees here
about what is fastest. Perhaps polling this mailing list will be
helpful, but if you are looking for "the fastest" solution regardless
of which MPI implementation you use (and which interconnect you use...
which might be determined at run time) you will probably be
Using a collective call like MPI_Gather may be worthwhile, but it
doesn't deploy additional threads, and additional threads could indeed
help in certain cases.
In addition to MPI implementation and which interconnect (or BTL) one
uses, another important variable is message length. Short messages may
be sent "eagerly" while long messages may involve more synchronization
between master and slaves.
Nonblocking send/recv wouldn't stop; instead you must
check the status on the slave side to see if data has been sent.
Yes and no. Again, data can be sent from the master but not yet
received by the slave (if the MPI implementation buffers the data
Nonblocking is faster on the master side because the
master doesn't need to wait for the slave to receive the data to
??? For most sends, the master has to wait only on the data to leave
the user send buffer.
So when you say you want your master to send "as fast as
possible", I suppose you meant get back to running your code as soon as
possible. In that case you would want nonblocking. However when you
say you want the slaves to receive data faster, it seems you're
implying the actual data transmission across the network. I believe
the data transmission speed is not dependent on whether it is
blocking or nonblocking.
On Sun, Jan 30, 2011 at 11:09 AM, Toon
If I have a master-process that needs to send a chunk of (different)
data to each of my N slave processes as fast as possible, would I
receive the chunk in each of the slaves faster if the master would
launch N threads each doing a blocking send or would it be better to
launch N nonblocking sends in the master?
I'm currently using OpenMPI on ethernet but might the approach be
different with different types of networks?
thanks in advance,