I was using the v1.2 branch. Gleb's fix has resolved the problem.
On 9/25/06, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> What version of Open MPI are you using?
> We had a bug with this on the trunk and [unreleased] v1.2 branch; it was
> just fixed within the last few hours in both places. It should not be a
> problem in the released v1.1 series.
> Can you confirm that you were using the OMPI trunk or the v1.2 branch? If
> you're seeing this in the v1.1 series, then we need to look at this a bit
> more closely.
> On 9/22/06 1:25 PM, "Nysal Jan" <jnysal_at_[hidden]> wrote:
> > The ompi_info command shows the following description for
> > "btl_openib_max_btls" parameter
> > MCA btl: parameter "btl_openib_max_btls" (current value: "-1") Maximum
> > number of HCA ports to use (-1 = use all available, otherwise must be >= 1)
> > Even though I specify "mpirun --mca btl_openib_max_btls 1 ....." 2
> > BTLs are created (the HCA has 2 ports).
> > When I try to run Open MPI across 2 nodes (one node has an HCA with 2
> > ports and the other has only one port), both endpoints send the QP
> > information over to the peer. Only one endpoint exists at the peer, so it
> > prints the
> > following error message:
> > [0,1,1][btl_openib_endpoint.c:706:mca_btl_openib_endpoint_recv] can't find
> > suitable endpoint for this peer
> > [0,1,0][btl_openib_endpoint.c:913:mca_btl_openib_endpoint_connect] error
> > posting receive errno says Operation now in progress
> > [0,1,0][btl_openib_endpoint.c:737:mca_btl_openib_endpoint_recv] endpoint
> > connect error: -1
> > Is "btl_openib_max_btls" the maximum number of BTLs overall, or the
> > maximum number of BTLs per port (which is what the current implementation
> > in "init_one_hca()" appears to do)?
> > -Nysal
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Jeff Squyres
> Server Virtualization Business Unit
> Cisco Systems