Open MPI Development Mailing List Archives

From: Tim S. Woodall (twoodall_at_[hidden])
Date: 2005-11-29 15:40:00


George Bosilca wrote:
> Tim,
>
> It looks a little bit better. Here are the latencies for 1- to 4-byte
> messages as well as for the maximum length in NetPIPE (8 MB).
>
> old ob1:
> 0: 1 bytes 694 times --> 0.06 Mbps in 137.54 usec
> 1: 2 bytes 727 times --> 0.11 Mbps in 140.54 usec
> 2: 3 bytes 711 times --> 0.16 Mbps in 141.54 usec
> 3: 4 bytes 470 times --> 0.22 Mbps in 140.55 usec
> 121: 8388605 bytes 3 times --> 889.75 Mbps in 71929.97 usec
> 122: 8388608 bytes 3 times --> 889.72 Mbps in 71932.47 usec
> 123: 8388611 bytes 3 times --> 889.59 Mbps in 71943.16 usec
>
> new ob1:
> 0: 1 bytes 760 times --> 0.07 Mbps in 116.08 usec
> 1: 2 bytes 861 times --> 0.13 Mbps in 116.73 usec
> 2: 3 bytes 856 times --> 0.20 Mbps in 116.69 usec
> 3: 4 bytes 571 times --> 0.26 Mbps in 117.48 usec
> 121: 8388605 bytes 3 times --> 890.37 Mbps in 71880.14 usec
> 122: 8388608 bytes 3 times --> 890.33 Mbps in 71883.64 usec
> 123: 8388611 bytes 3 times --> 890.40 Mbps in 71878.00 usec
>
> teg:
> 0: 1 bytes 867 times --> 0.07 Mbps in 114.91 usec
> 1: 2 bytes 870 times --> 0.13 Mbps in 115.99 usec
> 2: 3 bytes 862 times --> 0.20 Mbps in 114.37 usec
> 3: 4 bytes 582 times --> 0.26 Mbps in 115.20 usec
> 121: 8388605 bytes 3 times --> 893.42 Mbps in 71634.49 usec
> 122: 8388608 bytes 3 times --> 893.22 Mbps in 71651.18 usec
> 123: 8388611 bytes 3 times --> 893.24 Mbps in 71649.35 usec
>
> uniq:
> 0: 1 bytes 870 times --> 0.07 Mbps in 114.59 usec
> 1: 2 bytes 872 times --> 0.13 Mbps in 114.20 usec
> 2: 3 bytes 875 times --> 0.20 Mbps in 114.52 usec
> 3: 4 bytes 582 times --> 0.27 Mbps in 113.70 usec
> 121: 8388605 bytes 3 times --> 893.41 Mbps in 71635.64 usec
> 122: 8388608 bytes 3 times --> 893.57 Mbps in 71623.01 usec
> 123: 8388611 bytes 3 times --> 893.39 Mbps in 71637.67 usec
>
> raw tcp:
> 0: 1 bytes 1081 times --> 0.08 Mbps in 90.74 usec
> 1: 2 bytes 1102 times --> 0.17 Mbps in 90.88 usec
> 2: 3 bytes 1100 times --> 0.25 Mbps in 90.66 usec
> 3: 4 bytes 735 times --> 0.34 Mbps in 89.21 usec
> 121: 8388605 bytes 3 times --> 894.90 Mbps in 71516.32 usec
> 122: 8388608 bytes 3 times --> 894.99 Mbps in 71508.84 usec
> 123: 8388611 bytes 3 times --> 894.96 Mbps in 71511.51 usec
>
> The changes seem to remove around 20 microseconds, which is not
> bad. However, we are still 25 microseconds away from the raw TCP
> stream. Those 25 microseconds should come from somewhere in the
> TCP BTL, because request management is quite quick in Open MPI. I
> think I have an explanation. The NetPIPE TCP module does not use
> any select or poll functions; it just waits on the receive until
> it gets the full message. Since we do a select/poll before going to
> read from the socket, that should explain at least part of the
> difference. But there is still a small problem: why is ob1 3
> microseconds slower than teg or uniq?
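
To make the select/poll point above concrete, here is a minimal sketch of the two receive styles. It is in Python rather than C for brevity and is not the NetPIPE or Open MPI code; the function names `recv_exact` and `select_then_recv_exact` are made up for illustration. The second style pays for one extra readiness check before each read.

```python
import select
import socket


def recv_exact(sock, nbytes):
    """Block in recv() until exactly nbytes have arrived (no readiness check)."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def select_then_recv_exact(sock, nbytes, timeout_s=1.0):
    """Call select() before every recv(), as an event-driven progress loop would.

    Returns the data and the number of select() calls that were made.
    """
    chunks = []
    remaining = nbytes
    waits = 0
    while remaining > 0:
        readable, _, _ = select.select([sock], [], [], timeout_s)
        waits += 1
        if not readable:
            raise TimeoutError("no data became readable within the timeout")
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks), waits


# Small demonstration over a local socket pair (hypothetical payloads).
left, right = socket.socketpair()
left.sendall(b"ping")
print(recv_exact(right, 4))
left.sendall(b"pong")
print(select_then_recv_exact(right, 4))
left.close()
right.close()
```

Both functions return the same bytes; the difference is only the additional system call per read in the second one, which is the overhead being suggested above.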

Due to the structure of the pml/btl interface, the btl code does an
extra recv call. The cost of this varies; on odin it appears to be
closer to 0.5us. I'll have to think about this a bit; we may be able
to remove it.
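
As a generic illustration of how an extra recv call can arise (this is an assumption for the sake of example, not a description of the actual TCP BTL code path), consider reading a fixed-size header first and the payload second, versus reading both in one call. The 4-byte length prefix, `CountingSocket`, and both reader functions are hypothetical, and the sketch assumes small messages that arrive whole, so short reads are not handled.

```python
import socket
import struct

# Hypothetical wire format: 4-byte big-endian length, then the payload.
HEADER = struct.Struct("!I")


class CountingSocket:
    """Thin wrapper that counts how many recv() calls are issued."""

    def __init__(self, sock):
        self.sock = sock
        self.recv_calls = 0

    def recv(self, nbytes):
        self.recv_calls += 1
        return self.sock.recv(nbytes)


def send_frame(sock, payload):
    """Send the length prefix and the payload in a single write."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def read_header_then_payload(csock):
    """Two recv() calls: one for the header, one for the payload."""
    (length,) = HEADER.unpack(csock.recv(HEADER.size))
    return csock.recv(length)


def read_in_one_call(csock, bufsize=4096):
    """One recv() call into a larger buffer, then parse header and payload."""
    data = csock.recv(bufsize)
    (length,) = HEADER.unpack_from(data)
    return data[HEADER.size:HEADER.size + length]


# Demonstration over a local socket pair with a 1-byte payload.
left, right = socket.socketpair()
reader = CountingSocket(right)

send_frame(left, b"x")
read_header_then_payload(reader)
two_step_calls = reader.recv_calls

reader.recv_calls = 0
send_frame(left, b"x")
read_in_one_call(reader)
one_step_calls = reader.recv_calls

print(two_step_calls, one_step_calls)
left.close()
right.close()
```

For small messages the second variant saves one system call per message, at the cost of needing a staging buffer and extra bookkeeping when a read returns more or less than one full frame.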

Tim