When only sending a few messages, we get reasonably good IB performance,
~500MB/s (MVAPICH is 850MB/s). However, if I crank the number of
messages up, we drop to 3MB/s(!!!). This is with the OSU NBCL
mpi_bandwidth test. We are running Mellanox IB Gold 1.8 with 3.3.3
firmware on PCI-X (Cougar) boards. Everything works with MVAPICH, but
we really need the thread support in OpenMPI.
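
For reference, here is a minimal sketch of how a bandwidth figure like the ones above is usually derived: total bytes moved divided by elapsed wall time. The function name and the sample numbers are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the OSU test or from my runs.

```python
def bandwidth_mb_per_s(num_messages, message_bytes, elapsed_seconds):
    """Return throughput in MB/s (1 MB = 10**6 bytes) for a timed run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    total_bytes = num_messages * message_bytes
    return total_bytes / elapsed_seconds / 1e6


# Hypothetical inputs, for illustration only (not measured values).
few = bandwidth_mb_per_s(num_messages=64, message_bytes=1_000_000,
                         elapsed_seconds=0.128)
many = bandwidth_mb_per_s(num_messages=6400, message_bytes=1_000_000,
                          elapsed_seconds=2133.0)
print(f"few messages: {few:.0f} MB/s, many messages: {many:.1f} MB/s")
```

The point is only that the reported number depends on both the message count and the elapsed time, so a large per-message cost at high message counts shows up directly as a collapse in MB/s.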
Ideas? I noticed there is a plethora of runtime options configurable
for mvapi. Do I need to tweak these to get performance up?