
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Multiple Subnet MPI Fail
From: Terry Dontje (terry.dontje_at_[hidden])
Date: 2010-11-22 09:46:38


You're gonna have to use a protocol that can route through a machine;
OFED User Verbs (i.e., the openib BTL) does not do this. The only way I
know of to do this via OMPI is with the tcp BTL.

--td
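
To illustrate the point above, here is a minimal Python sketch (not part of the original message) that models the three hosts from the quoted message below and checks which pairs can reach each other directly versus through an intermediate machine. It also builds an mpirun command line that selects the tcp BTL. The program name ./my_app and the process count are hypothetical, and the sketch assumes that IP routing between B and C through A has already been configured separately, which this thread does not cover.

```python
from itertools import combinations

# Direct links taken from the quoted message: A(1)--B and A(2)--C.
# B and C are not cabled to each other.
LINKS = {("A", "B"), ("A", "C")}


def directly_connected(x, y, links=LINKS):
    """True if x and y share a link (what a non-routing transport needs)."""
    return (x, y) in links or (y, x) in links


def reachable_via_routing(x, y, links=LINKS):
    """True if any multi-hop path joins x and y (what a routable protocol needs)."""
    seen, frontier = {x}, [x]
    while frontier:
        node = frontier.pop()
        for a, b in links:
            for src, dst in ((a, b), (b, a)):
                if src == node and dst not in seen:
                    seen.add(dst)
                    frontier.append(dst)
    return y in seen


def unreachable_pairs(hosts, can_reach):
    """List every host pair that the given reachability test rejects."""
    return [(x, y) for x, y in combinations(sorted(hosts), 2)
            if not can_reach(x, y)]


def mpirun_tcp_command(program, nprocs):
    """Build an mpirun argument list that selects the tcp, self and sm BTLs."""
    return ["mpirun", "--mca", "btl", "tcp,self,sm",
            "-np", str(nprocs), program]


if __name__ == "__main__":
    hosts = {"A", "B", "C"}
    print("no direct link:", unreachable_pairs(hosts, directly_connected))
    print("no routed path:", unreachable_pairs(hosts, reachable_via_routing))
    # ./my_app and 6 are placeholders, not values from the thread.
    print(" ".join(mpirun_tcp_command("./my_app", 6)))
```

In this model the only pair without a direct link is B and C, which matches the hosts named in the error (pg-B and pg-C), while every pair has a path once traffic may pass through A.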

On 11/22/2010 09:28 AM, Paul Monday (Parallel Scientific) wrote:
> We've been using OpenMPI in a switched environment with success, but
> we've moved to a point-to-point environment to do some work. Some of
> the nodes cannot talk directly to one another. The layout looks like
> this, with computers A, B, and C, and A having two ports:
>
> A(1)(opensm)------>B
> A(2)(opensm)------>C
>
> B is not connected to C in any way.
>
> When we try to run our OpenMPI program we are receiving:
> At least one pair of MPI processes are unable to reach each other for
> MPI communications. This means that no Open MPI device has indicated
> that it can be used to communicate between these processes. This is
> an error; Open MPI requires that all MPI processes be able to reach
> each other. This error can sometimes be the result of forgetting to
> specify the "self" BTL.
>
> Process 1 ([[1581,1],5]) is on host: pg-B
> Process 2 ([[1581,1],0]) is on host: pg-C
> BTLs attempted: openib self sm
>
> Your MPI job is now going to abort; sorry.
>
>
> I hope I'm not being overly naive, but is there a way to join the
> subnets at the MPI layer? It seems like IP over IB would be too high
> up the stack.
>
> Paul Monday
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.dontje_at_[hidden]


