Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Role of ethernet interfaces of startup of openmpi job using IB
From: Salvatore Podda (salvatore.podda_at_[hidden])
Date: 2011-09-30 06:29:31


Thanks for the prompt reply!

>
> On Sep 27, 2011, at 6:35 AM, Salvatore Podda wrote:
>
>> We would like to know if the Ethernet interfaces play any role in
>> the startup phase of an Open MPI job using InfiniBand.
>> If so, where can we find some literature on this topic?
>
> Unfortunately, there's not a lot of docs about this other than
> people asking questions on this list.
>

Given that, does anyone on the list know in which order the Ethernet
interfaces are queried when a node has more than one, and what rules
determine that order?

Regards

Salvatore Podda
> IP is used by default during Open MPI startup. Specifically, it is
> used as our "out of band" communication channel for things like
> stdin/stdout/stderr redirection, launch command relaying, process
> control, etc. The OOB channel is also used by default for
> bootstrapping IB queue pairs.
>
> To clarify, note that these are two different things:
>
> 1. the out of band (OOB) channel used for process control, std*
> routing, etc.
> 2. bootstrapping IB queue pairs
>
> You can change the IB QP bootstrapping to use the OpenFabrics RDMA
> communications manager (vs. our OOB channel) with the following:
>
> mpirun --mca btl_openib_cpc_include rdmacm ...
>
> See if that helps (although the OF RDMA CM has its own scalability
> issues, also associated with ARP).
>
> If your cluster is large, you might want to check out the section on
> our FAQ about large clusters:
>
> http://www.open-mpi.org/faq/?category=large-clusters
>
> I don't think there's an entry on there yet about this, but it may
> also be worthwhile to try enabling the "radix" support, a more
> scalable version of our OOB channel (i.e., the tree across all the
> support daemons has a much larger radix and is therefore much
> flatter). Los Alamos recently committed an IB UD OOB channel plugin
> to our development trunk and is comparing its performance to the
> radix tree to see if it's worthwhile.
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
>
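
The quoted reply says that IP is used by default for the out-of-band (OOB) channel during startup. On nodes with several Ethernet interfaces, one way to avoid depending on the default choice is to name the interface explicitly. The sketch below only builds and prints such a command line; it does not launch anything. The MCA parameter oob_tcp_if_include is not mentioned in this thread and is an assumption here, so confirm it for your Open MPI version (for example with "ompi_info --param oob tcp"). The interface name eth0, the process count, and ./my_app are hypothetical placeholders.

```shell
#!/bin/sh
# Dry-run sketch: assemble an mpirun command line and print it.
# Nothing is launched; all three values below are hypothetical.
OOB_IF="eth0"       # Ethernet interface to use for the OOB channel
NPROCS="4"          # number of MPI processes
APP="./my_app"      # MPI executable

build_mpirun_cmd() {
    # oob_tcp_if_include (assumed parameter name, verify with ompi_info)
    # restricts the TCP out-of-band channel to the named interface.
    printf 'mpirun --mca oob_tcp_if_include %s -np %s %s\n' \
        "$OOB_IF" "$NPROCS" "$APP"
}

CMD="$(build_mpirun_cmd)"
echo "$CMD"
```

Printing the command first makes it easy to check the interface name against the output of "ip addr" or "ifconfig" on the compute nodes before running the printed line for real.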