Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] RoCE (IBoE) & OpenMPI
From: Michael Shuey (shuey_at_[hidden])
Date: 2011-02-24 08:00:09


Late yesterday I had a chance to test the patch Jeff provided
(against 1.4.3 - testing 1.5.x is on the docket for today). While it
works, in that I can specify a gid_index, it doesn't do everything
required - my traffic won't match a lossless CoS on the Ethernet
switch. Specifying a GID is only half of it; I also need to specify
a service level (SL).

The bottom 3 bits of the IB SL are mapped to the PCP bits in
Ethernet's VLAN tag. With a non-default GID, I can select an available
VLAN (so RoCE's packets will include the PCP bits), but the only way to
specify a priority is to use an SL. So far, the only RoCE-enabled app
I've been able to make work correctly (such that traffic matches a
lossless CoS on the switch) is ibv_rc_pingpong - and then, I need to
use both a specific GID and a specific SL.
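
The mapping above can be sketched in a few lines. This is only an
illustration, not driver code: the function names are made up for the
example. The only facts it encodes are that the low 3 bits of the SL
become the PCP value, and that the 802.1Q tag control field packs
PCP (3 bits), DEI (1 bit), and VLAN ID (12 bits).

```python
def sl_to_pcp(sl: int) -> int:
    """Map an IB service level to an Ethernet PCP value (hypothetical helper)."""
    if not 0 <= sl <= 15:
        raise ValueError("IB SL is a 4-bit value (0-15)")
    return sl & 0x7  # only the bottom 3 bits are carried into the VLAN tag


def vlan_tci(pcp: int, vlan_id: int, dei: int = 0) -> int:
    """Pack an 802.1Q tag control field: PCP(3) | DEI(1) | VID(12)."""
    if not 0 <= pcp <= 7:
        raise ValueError("PCP is a 3-bit value (0-7)")
    if dei not in (0, 1):
        raise ValueError("DEI is a single bit")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID is a 12-bit value (0-4095)")
    return (pcp << 13) | (dei << 12) | vlan_id


# Example: SL 4 on VLAN 100 (both values are arbitrary illustrations).
tci = vlan_tci(sl_to_pcp(4), 100)
print(hex(tci))
```

Note that SLs 8-15 alias onto the same eight priorities as SLs 0-7,
since only three bits survive. For ibv_rc_pingpong, the GID index and
SL are passed with the -g and -l options respectively.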

The slides Pavel found seem a little misleading to me. The VLAN isn't
determined by the bound netdev; all VLAN netdevs map to the same IB
adapter for RoCE. The VLAN is determined by the GID index. Also, the SL
isn't determined by a set kernel policy; it's provided via the IB
interfaces. As near as I can tell from Mellanox's documentation, the
OFED test apps, and the driver source, a RoCE adapter is an InfiniBand
card in almost all respects (even more so than an iWARP adapter).
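
Since the VLAN follows from the GID index, it helps to see which
indices are actually populated on a port. Here is a rough sketch that
reads the port's GID table from sysfs. It assumes the usual Linux
layout of /sys/class/infiniband/<device>/ports/<port>/gids/<index>;
the function name and the "mlx4_0" device name are just examples, and
which index corresponds to which VLAN is up to the driver, so this
only lists the entries.

```python
from pathlib import Path

# An all-zero GID marks an unused slot in the table.
ZERO_GID = "0000:0000:0000:0000:0000:0000:0000:0000"


def list_gids(device, port=1, sysfs_root="/sys/class/infiniband"):
    """Return {gid_index: gid_string} for populated GID table entries.

    Hypothetical helper; assumes the standard sysfs layout noted above.
    """
    gid_dir = Path(sysfs_root) / device / "ports" / str(port) / "gids"
    table = {}
    for entry in gid_dir.iterdir():
        if not entry.name.isdigit():
            continue
        try:
            gid = entry.read_text().strip()
        except OSError:
            continue  # reading an unused slot may fail on some kernels
        if gid and gid != ZERO_GID:
            table[int(entry.name)] = gid
    return dict(sorted(table.items()))


if __name__ == "__main__":
    # "mlx4_0" is an assumed device name; substitute your own.
    gid_path = Path("/sys/class/infiniband/mlx4_0/ports/1/gids")
    if gid_path.is_dir():
        for index, gid in list_gids("mlx4_0", 1).items():
            print(index, gid)
```

The index printed in the first column is the value to pass as the GID
index when selecting a VLAN.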

--
Mike Shuey
On Wed, Feb 23, 2011 at 5:03 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> On Feb 23, 2011, at 3:54 PM, Shamis, Pavel wrote:
>
>> I remember that I updated the trunk to select the RDMACM connection manager by default for RoCE ports - https://svn.open-mpi.org/trac/ompi/changeset/22311
>>
>> I'm not sure if the change made its way to any production version. I don't work on this part of the code anymore :-)
>
> Mellanox -- can you follow up on this?
>
> Also, in addition to the patches I provided for selecting an arbitrary GID (I was planning on committing them when Mike tested them at Purdue, but perhaps I should just commit to the trunk anyway), perhaps we should check if a non-default SL is supplied via MCA param in the RoCE case and output an orte_show_help to warn that it will have no effect (i.e., principle of least surprise and all that).
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/