After further reflection, I wonder whether you have a firewall that is
preventing connections to certain ports.
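
One quick way to test this outside of MPI is to try a plain TCP
connection from one node to the other. The following is only a minimal
sketch, assuming Python 3 is available on both machines. The function
name can_connect and the port number in the note below are made up for
illustration and are not anything Open MPI defines.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try one TCP connection to host:port.

    Returns (True, "") on success, or (False, reason) where reason is
    the text of the OS-level error (refused, timed out, unreachable...).
    """
    try:
        # create_connection resolves the host and performs the connect();
        # the "with" block closes the socket again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:
        # socket.timeout and socket.gaierror are subclasses of OSError,
        # so timeouts and name-resolution failures end up here too.
        return False, str(exc)
```

For example, start any listener on 10.1.10.240 on a port of your
choosing (5000 is just a placeholder), then call
can_connect("10.1.10.240", 5000) from 10.1.10.208. A timeout typically
points at packets being dropped on the way, whereas "Connection refused"
usually means the host was reached but nothing was listening on that
port, or a firewall actively rejected the connection.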
Terry Dontje wrote:
> Hello Sofia,
> Ok, so I really wanted the stack from a run with "-mca
> mpi_preconnect_all 1". I believe you'll see that one of the processes
> will be in init. However, the stack still probably will not help me
> help you. What needs to happen is to step through the code in dbx
> while the connection is being established. I am hoping you
> might find the connect call fails or that we've been given an
> interface that somehow cannot reach the other node. However, when you
> specified "-mca btl_tcp_if_include eth1" that should have forced
> things to use the interface you need. So it really comes down to this:
> why are we not connecting to the eth1 address? Are we failing on routing
> to that address, or is the connect failing because we are trying to use
> a port that we are not really allowed to use, or is it something else?
> I don't think it is a routing problem since you are able to reach each
> node via ssh. Is there someone else on the list that might want to
> lend a hand here? I feel like I am missing something obvious going on
>> Date: Fri, 19 Sep 2008 16:09:11 +0200
>> From: "Sofia Aparicio Secanellas" <saparicio_at_[hidden]>
>> Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
>> To: "Open MPI Users" <users_at_[hidden]>
>> Message-ID: <1BBF50FE29F743B5829CC3785F47CADD_at_aparicio1>
>> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>> Hello Terry,
>> I have installed 1.2.7 and I obtain the same result.
>> I will explain what I have done.
>> 1. On my computer edu_at_10.1.10.240 I have added a new user called
>> sofia. This way I have sofia_at_10.1.10.208 and sofia_at_10.1.10.240.
>> 2. I have downloaded the openmpi 1.2.7 from the openmpi website on
>> both computers in /home/sofia/Desktop.
>> 3. I have installed everything using "sudo ./configure", "sudo make"
>> and "sudo make install".
>> 4. To make ssh not ask me for a password, I have typed on
>> sofia_at_10.1.10.208 "ssh-keygen -t dsa", "cd $HOME/.ssh" and "cp
>> id_dsa.pub authorized_keys". I have copied the directory
>> "/home/sofia/.ssh" from sofia_at_10.1.10.208 to /home/sofia/.ssh on
>> sofia_at_10.1.10.240. The ssh command without password works on
>> computer sofia_at_10.1.10.208 but computer sofia_at_10.1.10.208 asks me for
>> a passphrase and for the password. Is this normal?
>> 5. I have created a directory "/home/sofia/programasparalelos" on
>> both computers and I have given permissions to the directory with
>> "chmod 777".
>> 6. I have copied the program "PruebaSumaParalela.c" to
>> "/home/sofia/programasparalelos" on both computers (I have changed
>> the program a little; I enclose the new
>> program) and I have compiled it using "mpicc PruebaSumaParalela.c -o
>> 7. Now I run the program on both computers using the command:
>> mpirun -np2 --host 10.1.10.208,10.1.10.240 --prefix /usr/local
>> When I run the program I see 3 processes running on each computer:
>> 2 of "./PruebaSumaParalela.out" and 1 of "mpirun -np2 --host
>> 10.1.10.208,10.1.10.240 --prefix /usr/local
>> ./PruebaSumaParalela.out". I enclose the results obtained on
>> each computer for each "./PruebaSumaParalela.out".
>> Thank you very much.