Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Openmpi not using IB and no warning message
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-10-26 15:20:50


On Oct 15, 2009, at 2:14 AM, Sangamesh B wrote:

> I've run ibpingpong tests. They are working fine.

Sorry for the delay in replying.

Good.

> Are there any additional tests available which will make sure that
> "there is no problem with IB software and Open MPI. The problem is
> with Application or IB hardware"?

George pointed out that using "--mca btl openib,self" restricts OMPI
to those two BTLs. So you should be good there -- with those command
line options, it will either run on IB or fail to run if IB is not
working.
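
As a rough sketch of what such a launch line looks like, the snippet
below assembles it in a small Python helper. Only the "--mca btl
openib,self" part comes from the discussion above; the helper name
build_mpirun_command, the application path ./my_app, and the process
count of 4 are placeholders chosen for illustration.

```python
import shlex


def build_mpirun_command(app, num_procs, btls=("openib", "self")):
    """Return an mpirun argv that restricts Open MPI to the given BTLs.

    `app` and `num_procs` are hypothetical placeholders; substitute your
    own executable and process count.
    """
    return [
        "mpirun",
        "--mca", "btl", ",".join(btls),  # only these BTLs may be used
        "-np", str(num_procs),
        app,
    ]


cmd = build_mpirun_command("./my_app", 4)
print(shlex.join(cmd))
```

Passing the resulting list to subprocess.run(cmd, check=True) on a node
with Open MPI installed would launch the job; with this BTL list, a
non-working IB setup should show up as a failed launch rather than a
silent fallback to another network.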

Unfortunately, OMPI currently only has a negative acknowledgement when
you're *not* using high-performance networks -- it doesn't give you a
positive acknowledgement when it *is* using a high-performance network
(because using one is by far the more common case).

> Because we've faced some critical problems:
>
> http://www.open-mpi.org/community/lists/users/2009/10/10843.php

This one *appears* to be an application issue. But there was no
information provided beyond the initial posting, so it's impossible to
say.

> http://www.open-mpi.org/community/lists/users/2009/09/10700.php

Pasha had a good reply to this post:

     http://www.open-mpi.org/community/lists/users/2009/09/10705.php

If he's right (and he usually is :-) ), then one of your IB ports went
from ACTIVE to DOWN during the run, potentially indicating bad
hardware (i.e., Open MPI simply reported the error -- it's
possible/likely that Open MPI didn't *cause* the error). Pasha suggested using
ibdiagnet to verify your fabric. Failing that, you might want to
contact your IB/cluster vendor for assistance with a layer-0
diagnostic of your IB fabric.

Hope that helps!

-- 
Jeff Squyres
jsquyres_at_[hidden]