As per our last discussion, MPI_INIT(..) uses a TCP socket to exchange its service-id/lid with other MPI processes. I assume this applies irrespective of the underlying library used to establish the connection, i.e., libibcm or librdmacm. Please correct me if I am wrong.
Date: Wed, 24 Aug 2011 12:06:30 -0400
From: Jeff Squyres <email@example.com>
Subject: Re: [OMPI devel] Regarding Connection establishment in
OpenMPI (Jeff Squyres)
To: Open MPI Developers <firstname.lastname@example.org>
Content-Type: text/plain; charset=us-ascii
At the moment, our only "OOB" (out of band) module uses TCP sockets. This can use traditional ethernet or an emulated IP layer, such as IPoIB.
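As a rough illustration only (this is not Open MPI's OOB code), the sketch below shows two endpoints swapping address records over an ordinary TCP socket on loopback. The record fields (`lid`, `service_id`), the JSON line format, and the helper names `serve_once` and `exchange` are all assumptions made up for the example.

```python
import json
import socket
import threading


def serve_once(listener, my_info, received):
    """Accept one peer, read its record, then reply with our own."""
    conn, _ = listener.accept()
    with conn, conn.makefile("r") as reader:
        received.update(json.loads(reader.readline()))
        conn.sendall((json.dumps(my_info) + "\n").encode())


def exchange(port, my_info):
    """Connect to the listener, send our record, and return the peer's."""
    with socket.create_connection(("127.0.0.1", port)) as conn:
        conn.sendall((json.dumps(my_info) + "\n").encode())
        with conn.makefile("r") as reader:
            return json.loads(reader.readline())


# Hypothetical address records; real values would come from the fabric.
info_a = {"lid": 17, "service_id": 1000}
info_b = {"lid": 18, "service_id": 1001}

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

seen_by_a = {}
server = threading.Thread(target=serve_once, args=(listener, info_a, seen_by_a))
server.start()
seen_by_b = exchange(port, info_b)
server.join()
listener.close()

print("A learned:", seen_by_a)
print("B learned:", seen_by_b)
```

The same exchange would work unchanged over IPoIB, since the socket API only sees an IP address.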
On Aug 24, 2011, at 11:58 AM, Bhargava Ramu Kavati wrote:
> Hi Jeff,
> Thank you for your prompt response. I have a query related to MPI_INIT here. What underlying transport mechanism does OpenMPI use to exchange the service-id/lid during MPI_INIT? Is it a TCP/IP socket?
> Thanks & Regards,
> Message: 2
> Date: Mon, 22 Aug 2011 17:33:19 -0400
> From: Jeff Squyres <email@example.com>
> Subject: Re: [OMPI devel] Regarding Connection establishment in
> To: Open MPI Developers <firstname.lastname@example.org>
> Message-ID: <2399C470-7F91-49D4-A463-A8994691ABA7@cisco.com>
> Content-Type: text/plain; charset=us-ascii
> On Aug 22, 2011, at 9:35 AM, Bhargava Ramu Kavati wrote:
> > I am trying to explore the details of connection establishment in OpenMPI using libibcm/librdmacm.
> Note that the IB community has given up on ibcm. Our support of it is incomplete; I wouldn't look at it as an example.
> > In the code, I could not find how an OpenMPI app gets the service-id/lid of the remote node to which it wants to connect.
> In the normal case, we pass that information during MPI_INIT. It's a global gather / broadcast operation that we refer to as the "modex" (module exchange). I.e., each openib BTL module instance publishes its address information in the modex and sends it. Near the end of MPI_INIT, each MPI process receives the modex broadcast and caches it.
> During connection establishment, an MPI process will look in its modex cache to find the connection information for the peer process that it wants to connect to.
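The modex flow described above can be sketched as a small single-process simulation. This is a toy model under stated assumptions, not Open MPI's implementation: the `EndpointInfo` record, its fields, and the functions `modex_exchange` and `lookup_peer` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointInfo:
    """Hypothetical per-process address record (fields are illustrative)."""
    lid: int
    service_id: int


def modex_exchange(published):
    """Simulate the gather/broadcast: every rank ends up caching all records."""
    gathered = dict(published)  # gather: collect each rank's published record
    # broadcast: each rank receives and caches its own copy of the full table
    return {rank: dict(gathered) for rank in published}


def lookup_peer(cache, peer_rank):
    """At connection time, consult the local cache rather than the network."""
    return cache[peer_rank]


# Each rank publishes its own address information during "MPI_INIT".
published = {
    0: EndpointInfo(lid=0x11, service_id=1000),
    1: EndpointInfo(lid=0x12, service_id=1001),
    2: EndpointInfo(lid=0x13, service_id=1002),
}

caches = modex_exchange(published)

# Later, rank 0 wants to connect to rank 2 and reads its cached entry.
peer = lookup_peer(caches[0], 2)
print(f"rank 0 -> rank 2: lid={peer.lid:#x}, service_id={peer.service_id}")
```

The point of the design is that the lookup in the last step is purely local: no query to a remote service is needed once the exchange has completed.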
> > Also, I did not see any query in the code related to service_record_get from the SA. Can you please describe what is happening, or am I missing something here?
> IIRC, we don't currently use the SA because of its serialization and other resource bottlenecks (this is a hand-waving answer; I don't remember the exact reasons for not using the SA, but there were many discussions between the MPI and OpenFabrics communities a long time ago. The SA issues were not resolved to the MPI community's liking, IIRC, but this was a long time ago, and I don't even work for an IB vendor any more, so I might not be remembering this correctly...).
> Jeff Squyres
> For corporate legal information go to: