Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] RoCE (IBoE) & OpenMPI
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-02-18 15:44:02

On Feb 18, 2011, at 1:39 PM, Michael Shuey wrote:

> RoCE HCAs keep a GID table, like normal HCAs. Every time you bring up
> a vlan interface, another entry gets automatically added to the table.
> If I select one of these other GIDs, packets get a VLAN tag, and that
> contains the necessary priority bits (well, assuming I selected the
> right IB service level, which is mapped to the priority tag in the
> VLAN header) for the traffic to match a lossless class of service on
> the switch.

Ah -- I see it now (it's been a looong time since I've looked in Open MPI's verbs code!). We query and simply take the 0th GID from a given IBV device port's GID table.

> For this to work, I really need for the IB client to select a
> non-default GID. A few test programs included in OFED will do this,
> but I'm not sure OpenMPI will. Any thoughts?

Yes, we can do this. It's pretty easy to add an MCA parameter to select the Nth GID rather than always taking the 0th.
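
A rough sketch of what "select the Nth GID" could look like, written against a plain in-memory table so it compiles without libibverbs. Everything named here (`gid128`, `select_gid`) is hypothetical and is not Open MPI's actual code. In real verbs code the entries would come from `ibv_query_gid()` and the table length from the `gid_tbl_len` field of `struct ibv_port_attr`. Treating an all-zero entry as unpopulated is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit GID; real code would use union ibv_gid. */
typedef struct { uint8_t raw[16]; } gid128;

/* Copy entry n of a port's GID table into *out.
 * Returns 0 on success, or -1 if n is past the end of the table or the
 * entry is all zeros (assumed here to mean "not populated"). */
int select_gid(const gid128 *table, size_t table_len, size_t n, gid128 *out)
{
    static const gid128 zero_gid = {{0}};

    assert(table != NULL && out != NULL);
    if (n >= table_len)
        return -1;
    if (memcmp(&table[n], &zero_gid, sizeof zero_gid) == 0)
        return -1;
    *out = table[n];
    return 0;
}
```

With n = 0 this reproduces the current "always take the 0th GID" behavior. A run-time parameter would simply supply a different n, and the bounds and empty-entry checks would catch a value that does not exist on a particular node.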

To make this simple, can you make it so that the value of N is the same across all nodes in your cluster? Then you can set a site-wide MCA param for that value of N and be done with this issue. If we have to have a per-node setting of N, it could get a little hairy (it's do-able, but... it's a heckuva lot easier if N is the same everywhere).

Jeff Squyres