I ran NetPIPE on 4 different clusters with different OSes and Ethernet
devices. The result is that nearly the same behaviour shows up every
time for small messages. Basically, our latency is really bad. Attached
are 2 of the graphs, one from a Mac OS X cluster (wotan) and one from a
32-bit Linux 2.6.10 cluster. The graphs are for NetPIPE compiled over
TCP, and for Open MPI with all the PMLs (uniq, teg and ob1).

Here is the global trend:
- we are always slower than native TCP (what a guess!)
- uniq is faster than teg by a fraction of a second (it's more visible on
- teg and uniq are always better than ob1 in terms of latency.
- the behaviour of ob1 differs between wotan and boba. On boba its
performance is a lot closer to the other PMLs, whereas on wotan it is
about 40 microseconds slower (it nearly doubles the raw TCP latency).
On the same nodes I also ran NetPIPE with SM and MX, and the results are
pretty good. So I think we have this latency problem only on TCP. I will
take a look to see exactly how it happens, but any help is welcome.
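
For anyone who wants a quick raw-TCP baseline to compare against, here is a rough sketch of the kind of ping-pong test that this sort of latency measurement relies on. It is not NetPIPE's code: the function names (recv_exact, echo_once, pingpong_latency) are made up for illustration, it only runs over loopback, and setting TCP_NODELAY on both ends is an assumption about what a fair small-message test should do, not a statement about what any of the PMLs do.

```python
import socket
import threading
import time


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def echo_once(listener, msg_size, iterations):
    """Accept one connection and echo back `iterations` messages."""
    conn, _ = listener.accept()
    with conn:
        # Disable Nagle's algorithm so small messages are sent immediately.
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(iterations):
            conn.sendall(recv_exact(conn, msg_size))


def pingpong_latency(msg_size=1, iterations=1000):
    """Estimate one-way latency in microseconds (half the mean round trip)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(
        target=echo_once, args=(listener, msg_size, iterations)
    )
    server.start()

    payload = b"x" * msg_size
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(iterations):
            client.sendall(payload)          # ping
            recv_exact(client, msg_size)     # pong
        elapsed = time.perf_counter() - start

    server.join()
    listener.close()
    # One iteration is a full round trip, so halve it for one-way latency.
    return elapsed / iterations / 2 * 1e6


if __name__ == "__main__":
    for size in (1, 64, 1024):
        print(f"{size:5d} bytes: {pingpong_latency(size):8.2f} us")
```

Loopback numbers say nothing about the Ethernet devices themselves, but the same loop pointed at a remote echo server gives a floor to compare the PML curves against.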