On Friday 09 July 2010, Andreas Schäfer wrote:
> I'm evaluating Open MPI 1.4.2 on one of our BladeCenters and I'm
> getting about 1550 MB/s via InfiniBand and about 1770 MB/s via shared
> memory for the PingPong benchmark in Intel's MPI benchmark suite. (That
> benchmark is just an example, I'm seeing similar numbers for my own
Two factors make a big difference here: the size of the operations and the
type of node (CPU model).
On an E5520 (Nehalem) node I get ~5 GB/s ping-pong for >64K message sizes.
Over QDR IB on similar nodes I get ~3 GB/s ping-pong for >256K.
Numbers are for Open MPI 1.4.1, YMMV. I couldn't find an AMD node similar
to yours.
> Each node has two AMD hex-cores and two 40 Gbps InfiniBand ports, so I
> wonder if I shouldn't be getting a significantly higher throughput on
> InfiniBand. Considering the CPUs' memory bandwidth, I believe that
> shared memory throughput should be much higher as well.
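For reference, a quick back-of-the-envelope check of where 1550 MB/s sits
relative to the theoretical peak of a single QDR 4x port (just a sketch; it
assumes the standard 8b/10b encoding overhead of QDR links and ignores
protocol headers):

```python
# Rough peak-bandwidth estimate for one QDR 4x InfiniBand port.
signal_rate_gbit = 40            # 4x QDR: 40 Gbit/s signalling rate
payload_fraction = 8 / 10        # QDR links use 8b/10b encoding
data_rate_gbit = signal_rate_gbit * payload_fraction   # 32 Gbit/s payload
peak_mb_per_s = data_rate_gbit * 1000 / 8              # 4000 MB/s peak
observed_mb_per_s = 1550
print(peak_mb_per_s)                      # 4000.0
print(observed_mb_per_s / peak_mb_per_s)  # 0.3875, i.e. ~39% of peak
```

So the observed number is well under half of the line rate of a single
port, which is why the question is a fair one.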
> Are those numbers what is to be expected? If not: any ideas how to
> debug this or tune Open MPI?
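As a starting point for debugging, Open MPI 1.4.x lets you pin the
transport and inspect its tunables via MCA parameters. Something along
these lines (hostfile name and binary path are placeholders):

```shell
# List the openib BTL's tunable parameters and their current values
ompi_info --param btl openib

# Run the benchmark over InfiniBand only (no TCP fallback)
mpirun -np 2 --hostfile hosts --mca btl self,openib ./IMB-MPI1 PingPong

# Run it over shared memory only (both ranks on the same node)
mpirun -np 2 --mca btl self,sm ./IMB-MPI1 PingPong
```

Forcing a single BTL this way at least tells you which transport the
numbers you measured actually came from.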
> Thanks in advance
> ps: if it's any help, this is what iblinkinfo is telling me
> (tests were run on faui36[bc])