Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Regarding Connection establishment in OpenMPI (Jeff Squyres)
From: Bhargava Ramu Kavati (ramu.kavati_at_[hidden])
Date: 2011-08-24 11:58:44


Hi Jeff,
Thank you for your prompt response. I have a question about MPI_INIT.
What underlying transport mechanism does OpenMPI use to exchange the
service-id/lid during MPI_INIT? Is it a TCP/IP socket?

Thanks & Regards,
Ramu

> Date: Mon, 22 Aug 2011 17:33:19 -0400
> From: Jeff Squyres <jsquyres_at_[hidden]>
> Subject: Re: [OMPI devel] Regarding Connection establishment in
> OpenMPI
> To: Open MPI Developers <devel_at_[hidden]>
> Message-ID: <2399C470-7F91-49D4-A463-A8994691ABA7_at_[hidden]>
> Content-Type: text/plain; charset=us-ascii
>
> On Aug 22, 2011, at 9:35 AM, Bhargava Ramu Kavati wrote:
>
> > I am trying to explore the details of connection establishment in OpenMPI
> using libibcm/librdmacm.
>
> Note that the IB community has given up on ibcm. Our support of it is
> incomplete; I wouldn't look at it as an example.
>
> > In the code, I could not find how an OpenMPI app gets the service-id/lid
> of the remote node to which it wants to connect.
>
> In the normal case, we pass that information during MPI_INIT. It's a
> global gather / broadcast operation that we refer to as the "modex" (module
> exchange). I.e., each openib BTL module instance publishes its address
> information in the modex and sends it. Near the end of MPI_INIT, each MPI
> process receives the modex broadcast and caches it.
>
> During connection establishment, an MPI process will look in its modex
> cache to find the connection information for the peer process that it wants
> to connect to.
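
As a rough illustration of the publish / cache / lookup pattern described above, here is a minimal single-process sketch in plain C. It is not Open MPI's actual modex code: the names peer_addr_t, modex_cache_t, modex_publish, modex_receive and modex_lookup are all hypothetical, the real gather/broadcast transport is replaced by a simple struct copy, and the lid/service_id fields are only assumed to be the kind of address information an openib BTL module would publish.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PROCS 64

/* Hypothetical per-process address record (field names are illustrative). */
typedef struct {
    uint16_t lid;         /* assumed: IB local identifier of the peer's port */
    uint64_t service_id;  /* assumed: CM service ID the peer listens on */
    int      valid;       /* 1 once the owning process has published */
} peer_addr_t;

/* Hypothetical table holding one record per MPI process, indexed by rank. */
typedef struct {
    peer_addr_t entries[MAX_PROCS];
    size_t      nprocs;
} modex_cache_t;

/* Zero the table so every entry starts out unpublished. */
void modex_init(modex_cache_t *cache, size_t nprocs)
{
    assert(nprocs <= MAX_PROCS);
    memset(cache, 0, sizeof(*cache));
    cache->nprocs = nprocs;
}

/* Step 1: a process publishes its own address information into the gather. */
void modex_publish(modex_cache_t *gathered, size_t rank,
                   uint16_t lid, uint64_t service_id)
{
    assert(rank < gathered->nprocs);
    gathered->entries[rank].lid = lid;
    gathered->entries[rank].service_id = service_id;
    gathered->entries[rank].valid = 1;
}

/* Step 2: near the end of init, a process receives the gathered table and
 * caches it locally. A struct copy stands in for the real broadcast. */
void modex_receive(modex_cache_t *local, const modex_cache_t *gathered)
{
    *local = *gathered;
}

/* Step 3: at connection time, look up the peer in the local cache.
 * Returns NULL if the peer is out of range or never published. */
const peer_addr_t *modex_lookup(const modex_cache_t *local, size_t peer)
{
    if (peer >= local->nprocs || !local->entries[peer].valid) {
        return NULL;
    }
    return &local->entries[peer];
}
```

In this sketch a caller would run modex_init once, have every rank call modex_publish, call modex_receive to fill its local cache, and later call modex_lookup(&local, peer_rank) when it first needs to connect to a peer. The point is only that the lookup at connection time is local and involves no SA query.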
>
> > Also, I did not see any query in the code related to service_record_get
> from SA. Can you please describe what is happening, or am I missing something
> here?
>
> IIRC, we don't currently use the SA because of its serialization and other
> resource bottlenecks (this is a hand-waving answer; I don't remember the
> exact reasons for not using the SA, but there were many discussions between
> the MPI and OpenFabrics communities a long time ago. The SA issues were not
> resolved to the MPI community's liking, IIRC, but this was a long time ago,
> and I don't even work for an IB vendor any more, so I might not be
> remembering this correctly...).
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
>