Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Multi-rail on openib
From: Nifty Tom Mitchell (niftyompi_at_[hidden])
Date: 2009-06-05 20:50:07


On Fri, Jun 05, 2009 at 09:52:39AM -0400, Jeff Squyres wrote:
>
> See this FAQ entry for a description:
>
> http://www.open-mpi.org/faq/?category=openfabrics#ofa-port-wireup
>
> Right now, there's no way to force a particular connection pattern on
> the openib btl at run-time. The startup sequence has gotten
> sufficiently complicated / muddied over the years that it would be quite
> difficult to do so. Pasha is in the middle of revamping parts of the
> openib startup (see http://bitbucket.org/pasha/ompi-ofacm/); it *may* be
> desirable to fully clean up the full openib btl startup sequence when
> he's all finished.
>
>
> On Jun 5, 2009, at 9:48 AM, Mouhamed Gueye wrote:
>
>> Hi all,
>>
>> I am working on multi-rail IB and I was wondering how connections are
>> established between ports. I have two hosts, each with 2 ports on a
>> same IB card, connected to the same switch.
>>

Is there a goal in mind?

In general, multi-rail cards run into bandwidth and congestion issues
on the host bus. If your card's host-side interface cannot sustain the
combined bandwidth of both IB links, then driving the two ports at once
may reduce the bandwidth each one delivers.
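
As a rough back-of-the-envelope check of that point, here is a minimal
sketch. The function names are made up for illustration, and the
figures (4X DDR links, PCIe x8 slots, 8b/10b encoding) are assumptions
to be replaced with the numbers for your own card and slot. It ignores
protocol overhead and memory-system limits, so treat the result as an
upper bound only.

```python
def data_rate_gbps(signal_gbps_per_lane, lanes):
    """Per-direction data rate after 8b/10b encoding (8 data bits per 10 sent)."""
    return signal_gbps_per_lane * lanes * 8 / 10


def usable_gbps(link_gbps, ports, bus_gbps):
    """Upper bound on per-direction bandwidth: the links or the bus, whichever is smaller."""
    return min(link_gbps * ports, bus_gbps)


if __name__ == "__main__":
    # Assumed example hardware, not taken from the thread.
    ib_4x_ddr = data_rate_gbps(5.0, 4)      # one 4X DDR IB port
    pcie_gen1_x8 = data_rate_gbps(2.5, 8)   # PCIe 1.x, 8 lanes
    pcie_gen2_x8 = data_rate_gbps(5.0, 8)   # PCIe 2.0, 8 lanes

    for name, bus in (("PCIe gen1 x8", pcie_gen1_x8), ("PCIe gen2 x8", pcie_gen2_x8)):
        one = usable_gbps(ib_4x_ddr, 1, bus)
        two = usable_gbps(ib_4x_ddr, 2, bus)
        print(f"{name}: one port {one} Gbit/s, two ports {two} Gbit/s")
```

If the two-port figure comes out no higher than the one-port figure,
the slot rather than the fabric is the limit, and a second rail on the
same card cannot add bandwidth.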

If the host bus and memory system are fast enough, then
work with the vendor.

In addition to system bandwidth, the subnet manager may need to be enhanced
to be multi-port card aware. Since IB fabric routes are static, it is possible
for pairs of links to be routed or used in so similar a way that there is
little bandwidth gain when multiple switches are involved.

Your two-host case may be simple enough to explore, and it may
generate illuminating or misleading results.
It is a good place to start.

Start with a look at opensm and the fabric then watch how Open MPI
or your applications use the resulting LIDs. If you are using IB directly
and not MPI then the list of protocol choices grows dramatically but still
centers on LIDs as assigned by the subnet manager (see opensm).
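
One way to see which LIDs the subnet manager handed out on a host is
to read them from sysfs. The sketch below assumes a Linux host whose
RDMA stack exposes ports under /sys/class/infiniband/<device>/ports/<n>/lid
with the LID printed in hex. The function name list_port_lids is made
up for this example, and the root directory is a parameter so it can
be pointed at a copy of the tree.

```python
from pathlib import Path


def list_port_lids(root="/sys/class/infiniband"):
    """Return sorted (device, port, lid) tuples for every IB port found under root."""
    base = Path(root)
    if not base.is_dir():
        return []  # no IB stack loaded, or not a Linux host
    found = []
    for lid_file in base.glob("*/ports/*/lid"):
        port_dir = lid_file.parent
        device = port_dir.parent.parent.name
        # Assumption: the file holds the LID in hex, e.g. "0x1".
        lid = int(lid_file.read_text().strip(), 16)
        found.append((device, int(port_dir.name), lid))
    return sorted(found)


if __name__ == "__main__":
    for device, port, lid in list_port_lids():
        print(f"{device} port {port}: LID {lid}")
```

Run it on both hosts after opensm has swept the fabric. A port that
still reports LID 0 has most likely not been assigned one yet.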

How many CPU cores (ranks) are you working with?

Do be specific about the IB hardware and associated firmware;
there are multiple choices out there and the vendor may be able to help.

-- 
	T o m  M i t c h e l l 
	Found me a new hat, now what?