With Mellanox's new HCA (ConnectX), extremely low latencies are
possible for short messages between two MPI processes. Currently,
OMPI's latency is around 1.9us while all other MPIs (HP MPI, Intel
MPI, MVAPICH, etc.) are around 1.4us. A big reason for this
difference is that MVAPICH, at least, does wire protocol header
caching whereas the openib BTL does not. Specifically:
- Mellanox tested MVAPICH with the header caching; latency was around
- Mellanox tested MVAPICH without the header caching; latency was
Given that OMPI is the lone outlier around 1.9us, I think we have no
choice except to implement the header caching and/or examine our
header to see if we can shrink it. Mellanox has volunteered to
implement header caching in the openib BTL.
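
To make the discussion concrete, here is a rough sketch of the general
idea behind header caching. It is not MVAPICH's actual scheme and not
the real openib BTL or PML header layout. The names hdr_t, hdr_cache_t,
hdr_pack and hdr_unpack are hypothetical, and it assumes in-order
delivery between two homogeneous peers. Each side remembers the last
header exchanged with its peer. When the next header differs only by
the expected sequence number, the sender puts a one-byte flag on the
wire instead of the full header, and the receiver rebuilds the header
from its own cached copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical match header; NOT the actual OMPI header layout. */
typedef struct {
    uint16_t ctx;   /* communicator context id */
    int32_t  src;   /* source rank */
    int32_t  tag;   /* user tag */
    uint16_t seq;   /* per-peer sequence number */
} hdr_t;

enum { HDR_FULL = 0, HDR_CACHED = 1 };

/* One cache per peer, kept on both the sending and receiving side. */
typedef struct {
    hdr_t last;     /* last header sent to / received from the peer */
    int   valid;    /* nonzero once a full header has been exchanged */
} hdr_cache_t;

/* Sender: write the header (or just the flag) to wire.
 * Returns the number of header bytes written. */
static size_t hdr_pack(hdr_cache_t *c, const hdr_t *h, uint8_t *wire)
{
    if (c->valid &&
        h->ctx == c->last.ctx &&
        h->src == c->last.src &&
        h->tag == c->last.tag &&
        h->seq == (uint16_t)(c->last.seq + 1)) {
        /* Cache hit: the receiver can predict every field. */
        wire[0] = HDR_CACHED;
        c->last.seq = h->seq;
        return 1;
    }
    /* Cache miss: send the full header and refresh the cache. */
    wire[0] = HDR_FULL;
    memcpy(wire + 1, h, sizeof(*h));
    c->last = *h;
    c->valid = 1;
    return 1 + sizeof(*h);
}

/* Receiver: rebuild the header into out.
 * Returns the number of header bytes consumed from wire. */
static size_t hdr_unpack(hdr_cache_t *c, const uint8_t *wire, hdr_t *out)
{
    if (wire[0] == HDR_CACHED) {
        /* A real implementation needs proper error handling here. */
        assert(c->valid);
        c->last.seq = (uint16_t)(c->last.seq + 1);
        *out = c->last;
        return 1;
    }
    memcpy(&c->last, wire + 1, sizeof(c->last));
    c->valid = 1;
    *out = c->last;
    return 1 + sizeof(c->last);
}
```

In this sketch the saving applies only to back-to-back messages with
the same context, source and tag, which is the pattern a ping-pong
latency benchmark produces. How such a cache would interact with the
PML is one of the things we would need to work out.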
Any objections? We can discuss what approaches we want to take
(there are going to be some complications because of the PML driver,
etc.); perhaps in the Tuesday Mellanox teleconf...?