Dear all,
I would like to ask about a topic that already has many questions,
but I am not very familiar with it. I want to understand the
behaviour of an application that sends many messages smaller than
64 KB (eager mode) over a TCP network. I am trying to understand
this in order to simulate the application.
For example, there may be one MPI_Send of 1200 bytes after some
computation, then two messages of the same size, then more
computation, etc. However, according to the measurements and the
profiling, the cost of the communication is less than the latency
of the network. I can understand that the cost of MPI_Send itself
is the copy to the buffer, but delivering the message to the
destination should cost at least the latency. So are the messages
buffered at the sender and then sent as one packet to the
receiver? My TCP window is 4 MB and I use the same value for
snd_buff and rcv_buff. If they are buffered at the sender, what is
the criterion/algorithm? I mean, if I have one message, then
computation, then another message, is it possible for these two
messages to be buffered on the sender side, or does this happen
only on the receiver? If there is any document/paper that I can
read about this, I would appreciate it if you could provide the
link.
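
To make the "copy to the buffer" part concrete, here is a minimal sketch at the plain socket level. It is Python over loopback TCP, not MPI, and the function name send_without_reader is only mine for illustration. It only shows that a blocking send() of a small message can return while the receiving side has not yet called recv(), i.e. the sender pays for handing the data to the kernel, not for the delivery. I assume, but do not know, that the eager path of the MPI library behaves similarly on top of this.

```python
import socket


def send_without_reader(payload_size=1200):
    """Send one small message over loopback TCP while the peer is not reading.

    Returns (bytes accepted by send(), bytes later read by the receiver).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.connect(listener.getsockname())
    receiver, _ = listener.accept()

    # The receiver has not called recv() yet when this send() returns.
    accepted = sender.send(b"x" * payload_size)

    # Only now read the data on the receiving side.
    received = 0
    while received < accepted:
        chunk = receiver.recv(65536)
        if not chunk:
            break
        received += len(chunk)

    for s in (sender, receiver, listener):
        s.close()
    return accepted, received


if __name__ == "__main__":
    print(send_without_reader())
```

This says nothing about when the bytes actually leave the host, which is exactly the part I would like to understand.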
A simple example: if I have a loop in which rank 0 sends two
messages to rank 1, then the first message takes longer than the
second one. If I increase the loop to 10 or 20 messages, then all
the other messages cost a lot less than the first one and also
less than what SkaMPI measures. So I am sure that it must be a
buffering issue (or something else that I can't think of).
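
The kind of measurement I mean can be sketched as follows. This is again plain Python over loopback TCP rather than my MPI code, and time_small_sends with its parameters is a hypothetical helper for illustration only. It records how long each blocking send of a small message takes on the sender side while nobody is reading yet, so the numbers are the cost seen by the sender, not the delivery time.

```python
import socket
import time


def time_small_sends(count=10, payload_size=1200):
    """Time each blocking send of a small message on the sender side.

    Returns a list with one duration in seconds per message.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.connect(listener.getsockname())
    receiver, _ = listener.accept()

    payload = b"x" * payload_size
    durations = []
    for _ in range(count):
        start = time.perf_counter()
        sender.sendall(payload)  # returns once the kernel has taken the data
        durations.append(time.perf_counter() - start)

    # Read everything on the receiving side after all sends have returned.
    remaining = count * payload_size
    while remaining > 0:
        chunk = receiver.recv(65536)
        if not chunk:
            break
        remaining -= len(chunk)

    for s in (sender, receiver, listener):
        s.close()
    return durations


if __name__ == "__main__":
    for i, d in enumerate(time_small_sends()):
        print(f"message {i}: {d * 1e6:.1f} us")
```

In my MPI runs the first iteration is the expensive one, and I would like to know whether that comes from buffering like this or from something else.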
Best regards,
Georges