
Open MPI User's Mailing List Archives


From: Biagio Cosenza (biacos_at_[hidden])
Date: 2007-07-23 06:43:26

I'm working on a parallel real-time renderer: an embarrassingly parallel
problem where latency is the limiting factor for high performance.

Two observations:

1) I ran a simple "ping-pong" test (the master does a Bcast, then an Irecv
for each node, then a Waitall), similar to the actual renderer workload. On
a cluster of 37 nodes on Gigabit Ethernet, the latency is usually low (about
1-5 ms), but occasionally there are peaks of about 200 ms. I suspect the
cause is a packet retransmission on one of the 37 connections, which ruins
the overall performance of the test (of course, the final Waitall acts as a
synchronization point).

2) A research team argues in a paper that MPI is poorly suited to managing
latency dynamically. They also raise an interesting point about enabling and
disabling the Nagle algorithm. (I paste the relevant paragraph below.)

So I have two questions:

1) Why does my test have these peaks? How can I deal with them (I think to btl tcp

2) When does Open MPI disable the Nagle algorithm? Supposing I DON'T need
Nagle to be ON (I care only about latency), how can I increase

Any useful suggestion will be REALLY appreciated.

Thanks in advance,
Biagio Cosenza

cut&paste from "Interactive Ray Tracing on Commodity PC clusters"
Saarland University, Germany

"... Communication Method: For handling communication, most parallel
processing systems today use standardized libraries such as MPI [8] or PVM
[10]. Although these libraries provide very powerful tools for development
of distributed software, they do not meet the efficiency requirements that
we face in an interactive environment.
Therefore, we had to implement all communication from scratch with standard
UNIX TCP/IP calls. Though this requires significant efforts, it allows to
extract the maximum performance out of the network. For example, consider
the 'Nagle' optimization implemented in the TCP/IP protocol, which delays
small packets for a short time period to possibly combine them with
successive packets to generate network-friendly packet sizes. This
optimization can result in a better throughput when lots of
small packets are sent, but can also lead to considerable latencies, if a
packet gets delayed several times. Direct control of the systems
communication allows to use such optimizations selectively: For example, we
turn the Nagle optimization on for sockets in which updated scene data is
streamed to the clients, as throughput is the main issue here. On the other
hand, we turn it off for e.g. sockets used to send tiles to the clients, as
this has to be done with an absolute minimum of latency. A similar behavior
would be hard to achieve with standard communication libraries. ..."
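
For reference, the per-socket Nagle switch that the quoted passage describes corresponds to the standard TCP_NODELAY socket option. The following is a minimal sketch assuming POSIX sockets; the helper names set_nagle and nagle_enabled are hypothetical, and the code is taken neither from the paper nor from Open MPI.

```c
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_NODELAY */
#include <sys/socket.h>   /* socket, setsockopt, getsockopt */
#include <unistd.h>       /* close */

/* Hypothetical helper: nagle_on = 1 leaves Nagle enabled (favoring
 * throughput); nagle_on = 0 disables it by setting TCP_NODELAY
 * (favoring latency). Returns 0 on success, -1 on error. */
int set_nagle(int fd, int nagle_on)
{
    int nodelay = nagle_on ? 0 : 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &nodelay, sizeof nodelay);
}

/* Returns 1 if Nagle is enabled on fd, 0 if disabled, -1 on error. */
int nagle_enabled(int fd)
{
    int nodelay = 0;
    socklen_t len = sizeof nodelay;
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &nodelay, &len) != 0)
        return -1;
    return nodelay ? 0 : 1;
}
```

In the scheme the paper describes, a socket streaming scene updates would be left with Nagle on (set_nagle(fd, 1)), while a socket carrying tiles would call set_nagle(fd, 0) before sending.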