Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Bug btl:tcp with grpcomm:hier
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-03-16 21:20:45


I believe I see the problem - and why it wouldn't show up for IB. It looks like the hier module passes an incorrect flag to the modex unpack function. That flag causes the function to store the modex values as attributes of the node, instead of placing them into the modex database under the individual process. So when you look up a value, you get a single value for the entire node.

Works for IB because the interface info is at the node level. Doesn't work for TCP because the "interface" info is at the proc level.
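
To make the failure mode concrete, here is a minimal toy sketch in Python. It is not Open MPI code: the `ModexStore` class, the `store_on_node` flag, the `exchange` helper, and all keys and values are hypothetical, invented only to show why storing per-process data at the node level breaks while node-level data survives.

```python
# Toy model of the described bug. NOT Open MPI code; every name is hypothetical.

class ModexStore:
    def __init__(self):
        self.node_attrs = {}  # node name -> {key: value}
        self.proc_db = {}     # (node name, rank) -> {key: value}

    def unpack(self, node, rank, key, value, store_on_node):
        if store_on_node:
            # Node-level storage: a later process on the same node
            # overwrites whatever an earlier process stored.
            self.node_attrs.setdefault(node, {})[key] = value
        else:
            # Per-process storage: each (node, rank) keeps its own value.
            self.proc_db.setdefault((node, rank), {})[key] = value

    def lookup(self, node, rank, key):
        # Prefer the per-process entry, fall back to the node attribute.
        proc_entry = self.proc_db.get((node, rank), {})
        if key in proc_entry:
            return proc_entry[key]
        return self.node_attrs.get(node, {}).get(key)


def exchange(entries, store_on_node):
    """Unpack every (node, rank, key, value) entry into a fresh store."""
    store = ModexStore()
    for node, rank, key, value in entries:
        store.unpack(node, rank, key, value, store_on_node)
    return store


# TCP-like data: two processes on one node publish different values.
tcp_entries = [
    ("node0", 0, "tcp", "10.0.0.1:5000"),
    ("node0", 1, "tcp", "10.0.0.1:5001"),
]
# IB-like data: every process on the node publishes the same value.
ib_entries = [
    ("node0", 0, "ib", "hca-A"),
    ("node0", 1, "ib", "hca-A"),
]
```

With `store_on_node=True`, both ranks on `node0` read back the same `tcp` value, because the second unpack overwrote the first. The `ib` entries are unaffected, since every process stored an identical value anyway. With `store_on_node=False`, each rank gets its own `tcp` value back.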

Since it was only tested on IB before, this didn't show up. Should be easy to fix.

On Mar 16, 2011, at 6:15 PM, Jeff Squyres wrote:

> On Mar 16, 2011, at 5:37 PM, George Bosilca wrote:
>
>> I just checked and IB does work correctly. But then I remembered that IB is different: the connections are peer based, so they don't happen during the modex exchange. The data is exchanged over RML messages, but outside the modex.
>
> Not quite. The openib BTL does use the modex to send around connection information. The actual connections are made lazily -- just like the TCP BTL -- but the OOB CPC (i.e., the default connection mode in the openib BTL) uses RML to do the 2/3 way handshake. That's all.
>
> But the point here is: the openib BTL does rely on the modex.
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]