Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Bug btl:tcp with grpcomm:hier
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-03-17 11:18:51


Would you mind filing these? I suspect you'll have to create patches - it might apply cleanly to 1.5, but I'm far less confident about 1.4. You might check to see if this even exists in 1.4 as I honestly don't remember.

Thanks
Ralph

On Mar 17, 2011, at 8:57 AM, Damien Guinier wrote:

> Yes please, this fix has been requested by Bull customers.
>
> damien
>
> On 17/03/2011 15:44, Jeff Squyres wrote:
>> Does this need to be CMR'ed to 1.4 and/or 1.5?
>>
>>
>> On Mar 16, 2011, at 10:27 PM, Ralph Castain wrote:
>>
>>
>>> Okay, I fixed this in r24536.
>>>
>>> Sorry for the problem, Damien - thanks for catching it! Went unnoticed because the folks at the Labs always use IB.
>>>
>>>
>>> On Mar 16, 2011, at 7:20 PM, Ralph Castain wrote:
>>>
>>>
>>>> I believe I see the problem - and why it wouldn't show up for IB. It looks like the hier module passes an incorrect flag to the modex unpack function, which causes that function to place the modex values as attributes assigned to the node instead of a process, rather than placing the values into the modex database. So when you look up a value, you get a single value for the entire node.
>>>>
>>>> Works for IB because the interface info is at the node level. Doesn't work for TCP because the "interface" info is at the proc level.
>>>>
>>>> Since it was only tested on IB before, this didn't show up. Should be easy to fix.
>>>>
>>>> On Mar 16, 2011, at 6:15 PM, Jeff Squyres wrote:
>>>>
>>>>
>>>>> On Mar 16, 2011, at 5:37 PM, George Bosilca wrote:
>>>>>
>>>>>
>>>>>> I just checked and IB does work correctly. But then I remembered that IB is different: the connections are peer-based, so they don't happen during the modex exchange. The data is exchanged over RML messages, but outside the modex.
>>>>>>
>>>>> Not quite. The openib BTL does use the modex to send around connection information. The actual connections are made lazily -- just like the TCP BTL -- but the OOB CPC (i.e., the default connection mode in the openib BTL) uses RML to do the 2/3 way handshake. That's all.
>>>>>
>>>>> But the point here is: the openib BTL does rely on the modex.
>>>>>
>>>>> --
>>>>> Jeff Squyres
>>>>> jsquyres_at_[hidden]
>>>>> For corporate legal information go to:
>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> devel_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>
>>>>
>>>
>>
>>
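
Illustration: the failure mode Ralph describes in this thread (modex values stored as node-level attributes instead of per-process entries) can be sketched with a toy model. This is not Open MPI code. The function names, the `per_node` flag, and the addresses are all hypothetical and stand in for the real unpack flag and modex database only loosely.

```python
# Toy model only: none of these names are Open MPI APIs.

def unpack_modex(entries, per_node):
    """Store (node, rank, value) entries keyed per node or per process."""
    store = {}
    for node, rank, value in entries:
        key = node if per_node else (node, rank)
        # With per-node keys, a later rank on the same node overwrites
        # the value stored for an earlier rank.
        store[key] = value
    return store


def lookup(store, node, rank, per_node):
    """Fetch the value for a given process under the chosen keying."""
    return store[node if per_node else (node, rank)]


# Two processes on one node, each with its own endpoint (made-up addresses).
tcp_entries = [
    ("node0", 0, "10.0.0.1:5000"),
    ("node0", 1, "10.0.0.1:5001"),
]
# Assumed node-level data: both processes publish the same value.
ib_entries = [
    ("node0", 0, "ib-port-info"),
    ("node0", 1, "ib-port-info"),
]

buggy_tcp = unpack_modex(tcp_entries, per_node=True)
fixed_tcp = unpack_modex(tcp_entries, per_node=False)
buggy_ib = unpack_modex(ib_entries, per_node=True)

rank0_buggy = lookup(buggy_tcp, "node0", 0, per_node=True)
rank0_fixed = lookup(fixed_tcp, "node0", 0, per_node=False)
rank0_ib = lookup(buggy_ib, "node0", 0, per_node=True)
```

In this sketch the per-node store returns rank 1's endpoint when rank 0 is looked up, while the per-process store returns the right one. The node-level data is unaffected either way, which matches the observation that the problem did not show up on IB.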