Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] how to select a specific network
From: Aurélien Bouteiller (bouteill_at_[hidden])
Date: 2008-01-11 16:24:30


ibd0 and ce0 have to be on the same network for this to work. Said
differently, IP must be able to find a route between ce0 and ibd0. If
those interfaces are on different private networks (like 192.168.1.x
and 192.168.2.x), this will not work.
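
One quick way to check, sketched here for Solaris hosts like the v440s
in Rolf's test (the interface names and addresses are only placeholders
for whatever your machines actually use):

   # print the address and netmask assigned to each candidate interface
   ifconfig ce0
   ifconfig ibd0

   # print the routing table; the peer's interface must be reachable
   # through one of these routes for the TCP BTL to connect
   netstat -rn

If ce0 comes back as, say, 192.168.1.10 netmask 255.255.255.0 while
ibd0 is 192.168.2.10 netmask 255.255.255.0, the two sit on different
subnets and the mixed if_include run will fail the way it does below.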

Aurelien
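
A related note for the original question about selecting a network that
is on e1000g0 on some hosts and e1000g1 on others: newer Open MPI
releases also accept subnet specifications (CIDR notation) for
btl_tcp_if_include / btl_tcp_if_exclude, which avoids naming the
interface at all. Whether your installed version supports this is easy
to check with ompi_info; the host names and subnet below are only an
illustration:

   # list the tcp BTL parameters your build understands
   ompi_info --param btl tcp

   # select the network by address range rather than by interface name
   mpirun -np 2 -host hostA,hostB -mca btl self,sm,tcp \
       -mca btl_tcp_if_include 192.168.1.0/24 connectivity_c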

On 11 Jan 2008, at 16:05, Rolf Vandevaart wrote:

>
> Hello:
> Have you actually tried this and got it to work? It did not work
> for me.
>
> burl-ct-v440-0 50 =>mpirun -host burl-ct-v440-0,burl-ct-v440-1 -np 1
>    -mca btl self,sm,tcp -mca btl_tcp_if_include ce0 connectivity_c :
>    -np 1 -mca btl self,sm,tcp -mca btl_tcp_if_include ce0 connectivity_c
> Connectivity test on 2 processes PASSED.
> burl-ct-v440-0 51 =>mpirun -host burl-ct-v440-0,burl-ct-v440-1 -np 1
>    -mca btl self,sm,tcp -mca btl_tcp_if_include ibd0 connectivity_c :
>    -np 1 -mca btl self,sm,tcp -mca btl_tcp_if_include ibd0 connectivity_c
> Connectivity test on 2 processes PASSED.
> burl-ct-v440-0 52 =>mpirun -host burl-ct-v440-0,burl-ct-v440-1 -np 1
>    -mca btl self,sm,tcp -mca btl_tcp_if_include ce0 connectivity_c :
>    -np 1 -mca btl self,sm,tcp -mca btl_tcp_if_include ibd0 connectivity_c
> --------------------------------------------------------------------------
> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> PML add procs failed
> --> Returned "Unreachable" (-12) instead of "Success" (0)
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> PML add procs failed
> --> Returned "Unreachable" (-12) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (goodbye)
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (goodbye)
> burl-ct-v440-0 53 =>
>
>
>
> Aurélien Bouteiller wrote:
>> Try something similar to this
>>
>> mpirun -np 1 -mca btl self,tcp -mca btl_tcp_if_include en1 NetPIPE_3.6/NPmpi :
>>    -np 1 -mca btl self,tcp -mca btl_tcp_if_include en0 NetPIPE_3.6/NPmpi
>>
>> You should then be able to specify a different if_include mask for
>> your different processes.
>>
>> Aurelien
>>
>> On 11 Jan 2008, at 06:46, Lydia Heck wrote:
>>
>>> I should have added that the two networks are not routable,
>>> and that they are private class B networks.
>>>
>>>
>>> On Fri, 11 Jan 2008, Lydia Heck wrote:
>>>
>>>> I have a setup which contains one set of machines
>>>> with one nge and one e1000g network, and a second set of machines
>>>> with two e1000g networks configured. I am planning a
>>>> large run where all these computers will be occupied
>>>> with one job, and the MPI communication should only go
>>>> over one specific network, which is configured on
>>>> e1000g0 on the first set of machines and on e1000g1 on the
>>>> second set. For obvious reasons I cannot simply include
>>>> all of the e1000g interfaces or exclude part of them, if
>>>> that is even possible.
>>>> So I have to exclude or include by IP address range.
>>>>
>>>> Is there an obvious flag - which I have not yet found - to tell
>>>> mpirun to use one specific network?
>>>>
>>>> Lydia
>>>>
>>>> ------------------------------------------
>>>> Dr E L Heck
>>>>
>>>> University of Durham
>>>> Institute for Computational Cosmology
>>>> Ogden Centre
>>>> Department of Physics
>>>> South Road
>>>>
>>>> DURHAM, DH1 3LE
>>>> United Kingdom
>>>>
>>>> e-mail: lydia.heck_at_[hidden]
>>>>
>>>> Tel.: + 44 191 - 334 3628
>>>> Fax.: + 44 191 - 334 3645
>>>> ___________________________________________
>>>>
>>> ------------------------------------------
>>> Dr E L Heck
>>>
>>> University of Durham
>>> Institute for Computational Cosmology
>>> Ogden Centre
>>> Department of Physics
>>> South Road
>>>
>>> DURHAM, DH1 3LE
>>> United Kingdom
>>>
>>> e-mail: lydia.heck_at_[hidden]
>>>
>>> Tel.: + 44 191 - 334 3628
>>> Fax.: + 44 191 - 334 3645
>>> ___________________________________________
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>> Dr. Aurélien Bouteiller
>> Sr. Research Associate - Innovative Computing Laboratory
>> Suite 350, 1122 Volunteer Boulevard
>> Knoxville, TN 37996
>> 865 974 6321
>>
>>
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
>
> =========================
> rolf.vandevaart_at_[hidden]
> 781-442-3043
> =========================
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users