Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

From: Nysal Jan (jnysal_at_[hidden])
Date: 2006-09-27 04:12:07


I was using the v1.2 branch. Gleb's fix has resolved the problem.
Thanks
--Nysal

On 9/25/06, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>
> What version of Open MPI are you using?
>
> We had a bug with this on the trunk and [unreleased] v1.2 branch; it was
> just fixed within the last few hours in both places. It should not be a
> problem in the released v1.1 series.
>
> Can you confirm that you were using the OMPI trunk or the v1.2 branch? If
> you're seeing this in the v1.1 series, then we need to look at this a bit
> closer...
>
>
> On 9/22/06 1:25 PM, "Nysal Jan" <jnysal_at_[hidden]> wrote:
>
> > The ompi_info command shows the following description for the
> > "btl_openib_max_btls" parameter:
> > MCA btl: parameter "btl_openib_max_btls" (current value: "-1") Maximum
> > number of HCA ports to use (-1 = use all available, otherwise must be >= 1)
> >
> > Even though I specify "mpirun --mca btl_openib_max_btls 1 .....", 2 openib
> > BTLs are created (the HCA has 2 ports).
> > When I try to run Open MPI across 2 nodes (one node has an HCA with 2 ports
> > and the other has only one port), both endpoints send the QP information
> > over to the peer. Only one endpoint exists at the peer, so it prints the
> > following error messages:
> > [0,1,1][btl_openib_endpoint.c:706:mca_btl_openib_endpoint_recv] can't find
> > suitable endpoint for this peer
> >
> > [0,1,0][btl_openib_endpoint.c:913:mca_btl_openib_endpoint_connect] error
> > posting receive errno says Operation now in progress
> >
> > [0,1,0][btl_openib_endpoint.c:737:mca_btl_openib_endpoint_recv] endpoint
> > connect error: -1
> >
> > Is "btl_openib_max_btls" the maximum number of BTLs overall, or the maximum
> > number of BTLs per port (which is what the current implementation in
> > "init_one_hca()" appears to do)?
> >
> > -Nysal
>
>
> --
> Jeff Squyres
> Server Virtualization Business Unit
> Cisco Systems
>
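
The question in the original message, whether "btl_openib_max_btls" caps the total number of BTLs or is applied separately per port, can be illustrated with a small sketch. This is not the Open MPI implementation: the function names, the `ports_per_hca` list, and the `btls_per_port` default are hypothetical and exist only to show how the two readings give different counts for an HCA with 2 ports and a limit of 1.

```python
def count_btls_global_cap(ports_per_hca, max_btls):
    """Reading 1 (assumed): the limit applies to the total number of BTLs."""
    total_ports = sum(ports_per_hca)
    # -1 means "use all available ports", as in the ompi_info description.
    return total_ports if max_btls == -1 else min(total_ports, max_btls)


def count_btls_per_port_cap(ports_per_hca, max_btls, btls_per_port=1):
    """Reading 2 (assumed): the limit is checked separately for each port."""
    total = 0
    for ports in ports_per_hca:
        for _ in range(ports):
            # Each port is capped on its own, so every port still gets a BTL.
            total += btls_per_port if max_btls == -1 else min(btls_per_port, max_btls)
    return total


# One HCA with 2 ports and a limit of 1, as in the report above.
print(count_btls_global_cap([2], 1))    # global reading
print(count_btls_per_port_cap([2], 1))  # per-port reading
```

Under the global reading the limit of 1 yields a single BTL, while under the per-port reading both ports still get one, which matches the 2 BTLs observed in the report.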