Jeff, I did what you suggested.
However, no noticeable change seems to happen: same peaks and same
latency times.
Are you sure that disabling Nagle's algorithm only requires changing
optval to 0?
I saw that, in btl_tcp_endpoint.c, the optval assignment is inside a
#if defined(TCP_NODELAY) block.
Where can this macro be defined?
Any other ideas for managing latency peaks?
On Jul 23, 2007, at 6:43 AM, Biagio Cosenza wrote:
> I'm working on a parallel real-time renderer: an embarrassingly
> parallel problem where latency is the barrier to high performance.
> Two observations:
> 1) I did a simple "ping-pong" test (the master does a Bcast + an
> Irecv for each node + a Waitall) similar to the effective renderer
> workload. Using a cluster of 37 nodes on Gigabit Ethernet, it seems
> that the latency is usually low (about 1-5 ms), but sometimes there
> are peaks of about 200 ms. I think the cause is a packet
> retransmission on one of the 37 connections, which blows the
> overall performance of the test (of course, the final WaitAll is a
> 2) A research team argues in a paper that MPI struggles to
> dynamically manage latency. They also raise an interesting
> point about enabling/disabling the Nagle algorithm. (I paste the
> relevant paragraph below.)
> So I have two questions:
> 1) Why does my test have these peaks? How can I avoid them (I am
> thinking of the btl tcp params)?
They are probably beyond Open MPI's control -- OMPI mainly does
read() and write() down TCP sockets and relies on the kernel to do
all the low-level TCP protocol / wire transmission stuff.
You might want to try increasing your TCP buffer sizes, but I think
that the Linux kernel has some built-in limits. Other experts might
want to chime in here...
> 2) When does OpenMPI disable the Nagle algorithm? Suppose I DON'T
> need Nagle to be ON (focusing only on latency); how can I
> increase performance?
It looks like we set TCP_NODELAY (i.e., disable Nagle) right when
TCP BTL connections are made. Surprisingly, it looks like we don't
have a run-time option to turn this off for power users like you who
want to really tweak around.
If you want to play with it, please edit
ompi/mca/btl/tcp/btl_tcp_endpoint.c. You'll see the references to
TCP_NODELAY in conjunction with setsockopt(). Set the optval to 0
instead of 1. A simple "make install" in that directory will
recompile the TCP component and re-install it (assuming you have
done a default build with OMPI components built as standalone
plugins). Let us know what you find.