
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Low Open MPI performance on InfiniBand and shared memory?
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-07-15 10:02:29


(still trolling through the history in my INBOX...)

On Jul 9, 2010, at 8:56 AM, Andreas Schäfer wrote:

> On 14:39 Fri 09 Jul, Peter Kjellstrom wrote:
> > 8x PCI-Express gen2 5 GT/s should show figures like mine. If it's PCI-Express
> > gen1, or gen2 running at 2.5 GT/s, or only 4x wide, or if the IB link only came
> > up with two lanes, then ~1500 MB/s is expected.
>
> lspci and ibv_devinfo tell me it's PCIe 2.0 x8 and InfiniBand 4x QDR
> (active_width 4X, active_speed 10.0 Gbps), so I /should/ be able to
> get about twice the throughput of what I'm currently seeing.
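
For a rough sanity check -- both PCIe gen2 and 4x QDR IB use 8b/10b encoding, so the
back-of-the-envelope limits are:

  IB 4x QDR:    4 lanes x 10 Gb/s x 8/10 = 32 Gb/s = 4 GB/s per direction, theoretical
  PCIe 2.0 x8:  8 lanes x 5 GT/s x 8/10  = 32 Gb/s = 4 GB/s per direction, theoretical

With protocol overhead, a clean QDR setup usually lands a bit above 3 GB/s in practice,
so ~1500 MB/s really is about half of what that hardware should deliver.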

You'll get different shared memory performance depending on whether you bind the two local procs to the same socket or to two different sockets. I don't know much about AMD's topologies, so I can't say offhand exactly what it'll do.
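
In the 1.4 series, mpirun has binding options along the lines of --bind-to-core / --bind-to-socket (plus --report-bindings to show what it actually did); check mpirun --help on your install. If you want to see where the procs really landed, a quick throwaway sketch like this works (Linux/glibc-only because of sched_getcpu(); not an Open MPI tool, just a test):

#define _GNU_SOURCE
#include <sched.h>      /* sched_getcpu() -- glibc 2.6+, Linux only */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(host, &len);

    /* Which core did this rank end up on?  Compare runs with different
       binding options to see same-socket vs. cross-socket placement. */
    printf("rank %d on %s, core %d\n", rank, host, sched_getcpu());

    MPI_Finalize();
    return 0;
}

Compile with mpicc, run 2 procs with the different binding options, and then re-run your shared memory benchmark with the same bindings -- same-socket vs. cross-socket placement can differ quite a bit on NUMA boxes.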

As for the IB performance, you want to make sure that your MPI process is bound to a core that is "near" the HCA for minimum latency and maximum bandwidth. Then also check that your IB fabric is clean (links trained at the expected width and speed, error counters not climbing, etc.). I believe that OFED comes with a bunch of verbs-level latency and bandwidth unit tests that can measure what you're getting across your fabric (i.e., raw network performance without MPI). It's been a while since I've worked deeply with OFED stuff; I don't remember the command names offhand -- perhaps ibv_rc_pingpong, or somesuch?
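
If it helps to have an MPI-level number to put next to the verbs-level one, a bare-bones ping-pong like the sketch below will do (written from memory, so treat it as a starting point, not a polished benchmark): run it with both procs on one node to exercise shared memory, or one proc on each of two nodes to exercise the IB path.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int iters = 100;
    const int msg   = 4 * 1024 * 1024;   /* 4 MB messages */
    int rank, size, i;
    char *buf;
    double t0, t1, sec;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size != 2) {
        if (rank == 0) fprintf(stderr, "run with exactly 2 procs\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    buf = malloc(msg);

    /* one untimed round trip to warm up the connection / register memory */
    if (rank == 0) {
        MPI_Send(buf, msg, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(buf, msg, MPI_BYTE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    } else {
        MPI_Recv(buf, msg, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Send(buf, msg, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, msg, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, msg, MPI_BYTE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(buf, msg, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, msg, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0) {
        /* each iteration moves msg bytes each way; report one-way bandwidth */
        sec = (t1 - t0) / (2.0 * iters);
        printf("%d-byte messages: %.0f MB/s per direction\n",
               msg, (msg / sec) / 1.0e6);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}

If the verbs-level tests show ~3 GB/s but MPI doesn't, it's likely a binding or tuning issue on the Open MPI side; if the verbs-level tests are also stuck around 1.5 GB/s, I'd look at the link width/speed and the PCIe slot first.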

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/