
Open MPI User's Mailing List Archives


From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-04-03 17:29:17


I have filed a ticket for this:

     https://svn.open-mpi.org/trac/ompi/ticket/972

On Apr 3, 2007, at 5:18 PM, Xie, Hugh wrote:

>
> I think the workaround you proposed would resolve this problem.
>
> -----Original Message-----
> From: users-bounces_at_[hidden] [mailto:users-bounces_at_open-mpi.org]
> On Behalf Of Jeff Squyres
> Sent: Tuesday, April 03, 2007 5:05 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Mpirun failed for machines not in the same
> subnet.
>
> Do your different subnets violate the assumptions listed here?
>
> http://www.open-mpi.org/faq/?category=tcp#tcp-routability
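>
> For example, one way to constrain which interfaces the TCP BTL uses is
> the btl_tcp_if_include MCA parameter (the interface name eth0 below is
> just a placeholder for whatever your nodes actually use):
>
>     mpirun --mca btl_tcp_if_include eth0 -np 2 -machinefile hosts.txt testc.x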
>
> We have not implemented any workarounds to say "subnet X is routable
> to subnet Y" because no one had asked for them. Such workarounds are
> possible, of course, but I don't know what kind of timeframe we would
> be able to implement them in. Contributions would always be
> accepted! :-)
>
> Probably the easiest workaround would be a top-level MCA parameter
> that effectively tells OMPI to assume that *all* TCP addresses are
> routable to each other. That might not be too difficult to implement.
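>
> As a sketch, if such a parameter existed (the name
> btl_tcp_assume_routable below is hypothetical, not an implemented
> option), usage would look something like:
>
>     mpirun --mca btl_tcp_assume_routable 1 -np 2 -machinefile hosts.txt testc.x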
>
>
> On Apr 3, 2007, at 4:11 PM, Xie, Hugh wrote:
>
>>
>> Hi,
>>
>> I got the following error message while running: 'mpirun -v -np 2
>> -machinefile hosts.txt testc.x'
>>
>> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
>> If you specified the use of a BTL component, you may have forgotten a
>> component (such as "self") in the list of usable components.
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
>> If you specified the use of a BTL component, you may have forgotten a
>> component (such as "self") in the list of usable components.
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> It looks like MPI_INIT failed for some reason; your parallel process
>> is likely to abort. There are many reasons that a parallel process
>> can fail during MPI_INIT; some of which are due to configuration or
>> environment problems. This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>> PML add procs failed
>> --> Returned "Unreachable" (-12) instead of "Success" (0)
>> --------------------------------------------------------------------------
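>>
>> (Note: I am not selecting BTL components explicitly. An explicit
>> selection that includes "self", as the hint above suggests, would
>> look like
>>
>>     mpirun --mca btl tcp,self -np 2 -machinefile hosts.txt testc.x
>>
>> but I am running plain mpirun as shown.)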
>>
>>
>> The same command works if the hosts in hosts.txt are in the same
>> subnet. Once I switch to hosts in different subnets, it stops
>> working. I am using Open MPI 1.2.
>> Please help.
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>

-- 
Jeff Squyres
Cisco Systems