
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] RoCE (IBoE) & OpenMPI
From: Michael Shuey (shuey_at_[hidden])
Date: 2011-02-18 16:14:10

Per-node GID & SL settings == bad. Site-wide GID & SL settings == good.

If this could be an MCA param (like btl_openib_ib_service_level)
that'd be great - we already have a global config file of similar
params. We'd definitely want the same N everywhere.

Mike Shuey
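
For anyone following along, here is a minimal sketch of the "same N everywhere" idea: pick entry N from a port's GID table and fail cleanly if that entry is not usable. It is not Open MPI's code. The type `sketch_gid` and the function `select_gid` are hypothetical names, and the table is a plain array so the logic stands alone. In a real verbs program the entries would come from `ibv_query_gid()` and the table length from `gid_tbl_len` in the result of `ibv_query_port()`. Treating an all-zero entry as unused is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the 16-byte GID (union ibv_gid in libibverbs). */
typedef struct {
    uint8_t raw[16];
} sketch_gid;

/*
 * Copy GID table entry `index` into *out.
 * Returns 0 on success, -1 if the index is outside the table or the
 * entry is all zeros (assumed here to mean "slot not populated").
 */
static int select_gid(const sketch_gid *table, int table_len,
                      int index, sketch_gid *out)
{
    static const sketch_gid zero = {{0}};

    assert(table != NULL && out != NULL);

    if (index < 0 || index >= table_len) {
        return -1;              /* N is not valid on this node */
    }
    if (memcmp(&table[index], &zero, sizeof(zero)) == 0) {
        return -1;              /* slot exists but holds no GID */
    }
    *out = table[index];
    return 0;
}
```

With a site-wide N, every node would call something like `select_gid(table, len, N, &gid)` with the same N, and a -1 return flags a node whose VLAN interfaces were brought up differently.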
On Fri, Feb 18, 2011 at 3:44 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> On Feb 18, 2011, at 1:39 PM, Michael Shuey wrote:
>> RoCE HCAs keep a GID table, like normal HCAs. Every time you bring up
>> a VLAN interface, another entry gets automatically added to the table.
>> If I select one of these other GIDs, packets get a VLAN tag, and that
>> contains the necessary priority bits (well, assuming I selected the
>> right IB service level, which is mapped to the priority tag in the
>> VLAN header) for the traffic to match a lossless class of service on
>> the switch.
> Ah -- I see it now (it's been a looong time since I've looked in Open MPI's verbs code!).  We query and simply take the 0th GID from a given IBV device port's GID table.
>> For this to work, I really need for the IB client to select a
>> non-default GID.  A few test programs included in OFED will do this,
>> but I'm not sure OpenMPI will.  Any thoughts?
> Yes, we can do this.  It's pretty easy to add an MCA parameter to select the Nth GID rather than always taking the 0th.
> To make this simple, can you make it so that the value of N is the same across all nodes in your cluster?  Then you can set a site-wide MCA param for that value of N and be done with this issue.  If we have to have a per-node setting of N, it could get a little hairy (it's do-able, but... it's a heckuva lot easier if N is the same everywhere).
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
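
The VLAN priority bits mentioned in the quoted text can be sketched as follows. The 802.1Q tag control field is 16 bits: a 3-bit priority (PCP), a 1-bit drop-eligible flag, and a 12-bit VLAN ID. The helper names `sl_to_pcp` and `vlan_tci` are hypothetical, and mapping the service level to the priority by taking its low 3 bits is an assumption; the actual mapping depends on the HCA driver and its configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mapping: the low 3 bits of the IB service level become the PCP. */
static uint8_t sl_to_pcp(uint8_t sl)
{
    return (uint8_t)(sl & 0x7);
}

/*
 * Build the 16-bit 802.1Q tag control field:
 *   bits 15..13  PCP (priority)
 *   bit  12      DEI (left at 0 here)
 *   bits 11..0   VLAN ID
 */
static uint16_t vlan_tci(uint8_t pcp, uint16_t vid)
{
    assert(pcp <= 7);
    return (uint16_t)(((uint16_t)pcp << 13) | (vid & 0x0FFF));
}
```

For example, under the assumed mapping, service level 3 on VLAN 100 gives `vlan_tci(sl_to_pcp(3), 100)`, which is 0x6064. The switch would then need priority 3 assigned to its lossless class of service.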