Jack Bryan wrote:
The master node can receive messages (all the same size) from 50
worker nodes.
But it cannot receive messages from 51 nodes. That caused a
"truncate error".
How big was the buffer that the program specified in the receive call?
How big was the message that was sent?
MPI_ERR_TRUNCATE means that you posted a receive with an application
buffer that turned out to be too small to hold the message that was
received. It's a user application error that has nothing to do with
MPI's internal buffers. MPI's internal buffers don't need to be big
enough to hold that message. MPI could require the sender and receiver
to coordinate so that only part of the message is moved at a time.
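To make the point about application buffers concrete, here is a minimal sketch in plain C (no MPI library involved) of the size check a receive conceptually performs. The names toy_recv, TOY_SUCCESS and TOY_ERR_TRUNCATE are made up for this illustration and are not part of MPI. Copying the part that fits is just this toy's choice, not a statement about what a real MPI leaves in the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes for this toy model; these are NOT MPI constants. */
enum { TOY_SUCCESS = 0, TOY_ERR_TRUNCATE = 1 };

/*
 * Toy model of a receive: deliver an incoming message of msg_len bytes
 * into the application's buffer, which can hold buf_len bytes.
 * Only the application buffer size matters here; no internal library
 * buffer appears in the check at all.
 */
int toy_recv(char *buf, size_t buf_len,
             const char *msg, size_t msg_len,
             size_t *received)
{
    /* Copy at most as many bytes as the application buffer can hold. */
    size_t n = msg_len < buf_len ? msg_len : buf_len;
    memcpy(buf, msg, n);
    *received = n;

    /* A message larger than the posted buffer is reported as an error. */
    return msg_len > buf_len ? TOY_ERR_TRUNCATE : TOY_SUCCESS;
}
```

In a real MPI program, one common way to avoid the error is to ask for the size of the incoming message first (MPI_Probe followed by MPI_Get_count) and only then post a receive with a buffer of that size.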
I used the same buffer to receive the messages in the 50-node case.
About the "rendezvous" protocol, what is the meaning of "the
sender sends a short portion"?
What is the "short portion"? Is it a small part of the sender's
message?
It's at least the message header (communicator, tag, etc.) so that the
receiver can figure out if this is the expected message or not. In
practice, there is probably some data in there as well. The
size of that portion depends on the MPI implementation and, in
practice, on the interconnect the message traveled over, on
MPI-implementation-dependent environment variables set by the user,
etc. E.g., with OMPI over shared memory the default is about 4 Kbytes
(if I remember correctly).
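As a rough illustration of the "short portion" idea, the sketch below models how a sender might split a message. It is plain C, not MPI code. The names toy_plan, toy_plan_send and TOY_EAGER_LIMIT are hypothetical, the 4096-byte limit is simply the roughly 4 Kbytes figure mentioned for OMPI over shared memory, and the assumption that the first portion carries exactly that many payload bytes is mine, not something any particular implementation promises.

```c
#include <stddef.h>

/* Assumed limit for this sketch only; real values depend on the MPI
   implementation, the interconnect, and user settings. */
#define TOY_EAGER_LIMIT 4096

/* How the toy sender splits a message. */
struct toy_plan {
    int    rendezvous;   /* 1 if the sender waits for an acknowledgement */
    size_t first_bytes;  /* payload bytes sent along with the header */
    size_t rest_bytes;   /* payload bytes held back until acknowledged */
};

/* Decide how a message of `total` payload bytes is sent. */
struct toy_plan toy_plan_send(size_t total, size_t eager_limit)
{
    struct toy_plan p;
    if (total <= eager_limit) {
        /* Short message: header and all data go out at once. */
        p.rendezvous  = 0;
        p.first_bytes = total;
        p.rest_bytes  = 0;
    } else {
        /* Long message: header plus a short portion first, the
           remainder only after the receiver acknowledges. */
        p.rendezvous  = 1;
        p.first_bytes = eager_limit;
        p.rest_bytes  = total - eager_limit;
    }
    return p;
}
```

With these assumptions, a 1000-byte message goes out in one piece, while a 10000-byte message is split into 4096 bytes up front and 5904 bytes after the acknowledgement.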
Can this "rendezvous" protocol work automatically in the background,
without the programmer
requesting it in his program?
Right. MPI actually allows you to force such synchronization with
MPI_Ssend, but typically MPI implementations use it automatically for
"plain" long sends as well, even if the user did not use MPI_Ssend.
Can the "acknowledgement" be generated by the receiver only
when the
corresponding mpi_irecv is posted by the receiver?
Right.
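The receiver side of that handshake can be sketched as a tiny state machine. This is plain C with made-up names (toy_receiver, toy_on_header, toy_on_post_recv), not the internals of any MPI library. It only models the rule discussed here: the acknowledgement goes out once both the sender's short portion has arrived and the matching receive has been posted, in whichever order those two events happen.

```c
#include <stdbool.h>

/* Receiver-side state for one incoming long message (toy model). */
struct toy_receiver {
    bool header_arrived;  /* the sender's short first portion is here */
    bool recv_posted;     /* the application has posted a matching receive */
    bool ack_sent;        /* the acknowledgement has already gone out */
};

/* Send the acknowledgement if, and only if, both conditions now hold.
   Returns true exactly when the ack is sent by this call. */
bool toy_try_ack(struct toy_receiver *r)
{
    if (!r->ack_sent && r->header_arrived && r->recv_posted) {
        r->ack_sent = true;
        return true;
    }
    return false;
}

/* Event: the sender's header (plus short portion) arrives. */
bool toy_on_header(struct toy_receiver *r)
{
    r->header_arrived = true;
    return toy_try_ack(r);
}

/* Event: the application posts the matching receive. */
bool toy_on_post_recv(struct toy_receiver *r)
{
    r->recv_posted = true;
    return toy_try_ack(r);
}
```

If the header arrives first, nothing is acknowledged until the receive is posted, so the sender keeps holding the rest of the message in the meantime.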