Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Change in btl/tcp
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2008-04-21 09:04:28


Adrian,

Has there been any progress on this bug? If you still cannot reproduce
it, send either Tim Prins or me a debugging patch and we will run it.
Alternatively, we can try to arrange access to one of our machines for you.

This bug is making it difficult for us to continue working off the
trunk, since we hit these connection errors so frequently.

-- Josh

On Apr 18, 2008, at 2:26 PM, Tim Prins wrote:

> To echo what Josh said, there are no special compile flags being used.
> If you send me a patch with debug output, I'd be happy to run it for
> you.
>
> Both odin and sif are fairly normal Linux-based clusters, with Ethernet
> and openib IP networks. The Ethernet network has both IPv4 and IPv6, and
> the openib network runs IPv4.
>
> Tim
>
> Adrian Knoth wrote:
>> On Fri, Apr 18, 2008 at 01:00:40PM -0400, Josh Hursey wrote:
>>
>>> The trick is to force Open MPI to use only tcp,self and nothing else.
>>> Did you try adding this (-mca btl tcp,self) to the runtime parameter
>>> set?
>>
>> Sure. Even with 64 processes, I cannot trigger this behaviour. Neither
>> on Linux nor Solaris.
>>
>> Any special compile flags?
>>
>> I guess a little bit more debug output could probably reveal the culprit.