Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] problems with MPI_Waitsome/MPI_Allstart and OpenMPI on gigabit and IB networks
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-07-23 17:24:00


On Jul 20, 2008, at 11:55 AM, Joe Landman wrote:

> update 2: (its like I am talking to myself ... :) must start using
> decaf ...)
>
> Joe Landman wrote:
>> Joe Landman wrote:
>
> [...]
>
>> ok, fixed this. Turns out we have ipoib going, and one adapter
>> needed to be brought down and back up. Now the tcp version appears
>> to be running, though I do get the strange hangs after a random
>> (never the same) number of iterations.
>
> ok, turned off ipoib (OFED 1.2 on this cluster), and disabled ib0 as
> a tcp port. Now, the --mca btl ^openib,sm setting results in a
> working code.

Sorry for the delay in replying -- I was in an OMPI engineering
meeting all last week, and such things tend to make me fall waaaay
behind on INBOX traffic...

Yes, we unfortunately have some fairly cryptic error messages there,
sorry about that. :-(

As you guessed, OMPI is aggressively trying to use all possible TCP-
providing interfaces. You'll want to explicitly include or exclude the
interfaces that OMPI should use (e.g., via mca_btl_tcp_if_include and
mca_btl_tcp_if_exclude). But note that OMPI actually has 2 separate
systems that use TCP, so you'll also need to set
mca_oob_tcp_if_include or mca_oob_tcp_if_exclude. BTL = MPI traffic;
OOB = "out of band" traffic that OMPI uses to set up jobs, exchange
information during MPI_INIT, etc.
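For example, to pin both subsystems to a single gigabit interface, something like the following should do it (the interface name eth0 and the app name are just placeholders -- substitute whatever your nodes actually have):

```shell
# Restrict both MPI point-to-point traffic (BTL) and OMPI's
# out-of-band setup traffic (OOB) to the eth0 interface, so that
# OMPI never tries to use ib0/IPoIB for either purpose.
mpirun --mca btl tcp,sm,self \
       --mca btl_tcp_if_include eth0 \
       --mca oob_tcp_if_include eth0 \
       -np 4 ./my_app
```

Equivalently, you can exclude interfaces instead of including them (e.g., `--mca btl_tcp_if_exclude ib0,lo`); just use the same approach for both the btl_tcp and oob_tcp parameters.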

http://www.open-mpi.org/faq/?category=tcp#tcp-multi-network
http://www.open-mpi.org/faq/?category=tcp#tcp-selection
http://www.open-mpi.org/faq/?category=tcp#tcp-routability

I just updated the tcp-multi-network entry to mention the oob MCA
params as well.

> This said, we have had no issues in the past with other codes on
> this cluster running them with OpenMPI on infiniband, over ipoib, or
> tcp, or shared memory. It appears that this code's use of
> MPI_Waitsome when using openib simply fails. When we use the same
> thing with two tcp ports (ipoib and gigabit), it fails at random
> iterations. Yet when we turn off ipoib, it works (as long as we
> turn off openib as well).

We added an option in 1.2.7 to disable one of OMPI's optimizations;
this optimization was called "early completion" and could result in
some applications hanging. Check out this FAQ entry and see if this
helps you out:

     http://www.open-mpi.org/faq/?category=openfabrics#v1.2-use-early-completion
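If that entry applies to you, the workaround amounts to disabling the optimization at run time via an MCA parameter; the parameter name below is from memory, so please double-check it against the FAQ entry:

```shell
# Disable the "early completion" optimization (available in v1.2.7
# and later); parameter name from memory -- verify against the FAQ.
mpirun --mca pml_ob1_use_early_completion 0 -np 4 ./my_app
```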

Note that this option won't be necessary with the upcoming v1.3
series; we changed how our progression engine works so that OMPI can
handle these kinds of situations better.

-- 
Jeff Squyres
Cisco Systems