Open MPI User's Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI users] Communitcation between OpenMPI and ClusterTools
From: Alexander Shabarshin (ashabarshin_at_[hidden])
Date: 2008-07-30 10:27:40


OK, thanks!

Is it possible to fix it somehow directly in the 1.2.x codebase?

----- Original Message -----
From: "Terry Dontje" <Terry.Dontje_at_[hidden]>
To: <users_at_[hidden]>
Sent: Wednesday, July 30, 2008 7:15 AM
Subject: Re: [OMPI users] Communitcation between OpenMPI and ClusterTools

> One last note to close this out. After some discussion on the
> developers list, it was pointed out that this problem was fixed with new
> code in the trunk and 1.3 branch. So my statement below about the trunk,
> 1.3 and CT8 EA2 supporting nodes on different subnets can be made
> stronger: we really do expect this to work.
>
> --td
> Terry Dontje wrote:
>> Terry Dontje wrote:
>>>>
>>>> Date: Tue, 29 Jul 2008 14:19:14 -0400
>>>> From: "Alexander Shabarshin" <ashabarshin_at_[hidden]>
>>>> Subject: Re: [OMPI users] Communitcation between OpenMPI and
>>>> ClusterTools
>>>> To: <users_at_[hidden]>
>>>> Message-ID: <00b701c8f1a7$9c24f7c0$c8afcea7_at_Shabarshin>
>>>> Content-Type: text/plain; format=flowed; charset="iso-8859-1";
>>>> reply-type=response
>>>>
>>>> Hello
>>>>
>>>>
>>>>>>>> One idea that comes to mind is whether the two nodes are on the
>>>>>>>> same subnet. If they are not on the same subnet, I think there is
>>>>>>>> a bug in which the TCP BTL will recuse itself from communications
>>>>>>>> between the two nodes.
>>>>>>>>
>>>>
>>>>
>>>>>> You are right: the subnets are different, but the routes are set up
>>>>>> correctly, and everything like ping, ssh, etc. works OK between
>>>>>> them.
>>>>>>
>>>>
>>>>
>>>>> But it isn't a routing problem; it is how the TCP BTL in Open MPI
>>>>> decides which interface the nodes can communicate with (completely
>>>>> out of the hands of the TCP stack and lower).
>>>>>
>>>>
>>>> Do you know when it can be fixed in official Open MPI?
>>>> Is a patch available or something?
>>>>
>>> Well, this problem is captured in ticket 972
>>> (https://svn.open-mpi.org/trac/ompi/ticket/972). There is a question
>>> as to whether this ticket has been fixed or not (that is, whether code
>>> was actually put back). Sun's experience with the trunk, 1.3 branch and
>>> CT8 EA2 release seems to be that you can now run jobs across subnets,
>>> but we (Sun) are not completely
>>>
>> I guess I should have ended with "mumble..mumble" :-)
>> Now for the rest of the sentence:
>>
>> ... sure whether the support is truly in there or we just got lucky in
>> how our setup was configured.
>>
>> --td
>>> FWIW, it looks like that code has had a lot of changes in it between
>>> 1.2 and 1.3.
>>>
>>> --td
>>>
>>
>>
>
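
The question raised in the quoted discussion above, whether two nodes sit on the same subnet, can be checked with a short sketch like the one below. This is only an illustration of the subnet comparison itself; it is not Open MPI's TCP BTL code and makes no claim about how the 1.2.x or 1.3 implementation decides reachability. The function name `same_subnet` and the addresses are hypothetical and do not come from the thread.

```python
import ipaddress


def same_subnet(iface_a: str, iface_b: str) -> bool:
    """Return True if two interfaces, given as 'address/prefix', share a network.

    Hypothetical helper for illustration only; not part of Open MPI.
    """
    net_a = ipaddress.ip_interface(iface_a).network
    net_b = ipaddress.ip_interface(iface_b).network
    return net_a == net_b


if __name__ == "__main__":
    # Made-up example addresses (assumptions, not the poster's real setup).
    print(same_subnet("192.168.1.10/24", "192.168.1.20/24"))
    print(same_subnet("192.168.1.10/24", "192.168.2.20/24"))
```

Two hosts can fail this check and still reach each other through a router (ping and ssh work), which is the situation described in the thread: the routing is fine, but a selection rule based on a comparison like this one would still treat the peers as unreachable.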