Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
From: Sofia Aparicio Secanellas (saparicio_at_[hidden])
Date: 2008-09-23 11:05:22


Hello Terry,

Here you can find the files.

Thank you very much.

Sofia
----- Original Message -----
From: "Terry Dontje" <Terry.Dontje_at_[hidden]>
To: <users_at_[hidden]>
Sent: Tuesday, September 23, 2008 4:23 PM
Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv

> Hello Sofia,
>
> After talking with another OMPI member, can you humor me and run
> "/sbin/iptables -L" on both of your machines? You'll need to be root to
> do so.
>
> --td
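
For readers comparing the "/sbin/iptables -L" listings from the two machines, here is a minimal Python sketch that summarizes such a listing. It assumes the usual plain "-L" layout, where each built-in chain starts with a "Chain NAME (policy POLICY)" header followed by a "target ..." column line. The function names and the SAMPLE text are illustrative only; SAMPLE is not output from the machines in this thread.

```python
import re

# Header printed by `iptables -L` for each built-in chain,
# e.g. "Chain INPUT (policy ACCEPT)". User-defined chains print
# "(N references)" instead and therefore do not match.
CHAIN_HEADER = re.compile(r"^Chain (\S+) \(policy (\w+)")


def summarize_iptables(listing: str) -> dict:
    """Return {chain: {"policy": str, "rules": int}} for built-in chains."""
    summary = {}
    current = None
    for line in listing.splitlines():
        header = CHAIN_HEADER.match(line)
        if header:
            current = header.group(1)
            summary[current] = {"policy": header.group(2), "rules": 0}
        elif line.startswith("Chain "):
            current = None  # user-defined chain, ignored in this sketch
        elif current and line.strip() and not line.startswith("target"):
            summary[current]["rules"] += 1  # any other non-empty line is a rule
    return summary


def looks_open(summary: dict) -> bool:
    """True when INPUT and OUTPUT accept by default and hold no rules."""
    return all(
        summary.get(chain) == {"policy": "ACCEPT", "rules": 0}
        for chain in ("INPUT", "OUTPUT")
    )


# Illustrative listing only; NOT taken from the machines in this thread.
SAMPLE = """\
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
"""

if __name__ == "__main__":
    result = summarize_iptables(SAMPLE)
    print(result)
    print("no filtering rules:", looks_open(result))
```

If looks_open() returns False for either machine, the rules in the INPUT and OUTPUT chains are worth reading line by line; a True result only says that this particular table holds no rules.
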
>
>
> Date: Tue, 23 Sep 2008 06:02:30 -0400
> From: Terry Dontje <Terry.Dontje_at_[hidden]>
> Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
> To: users_at_[hidden]
> Message-ID: <48D8BEB6.8040901_at_[hidden]>
> Content-Type: text/plain; format=flowed; charset=ISO-8859-1
>
> Hello Sofia,
>
> Looking at your stack trace, it is what I thought was happening: one
> process is stuck trying to connect to the other. The stack unfortunately
> does not give enough information as to why. The only suggestion I can
> give is to walk through a debuggable version of the code from
> ompi_init_do_preconnect, find where the process is calling connect, and
> see whether the connect call is failing. If you don't have a firewall, I
> am not sure what is blocking the connection. Either the address is
> somehow being mashed up or it is something else.
>
> --td
>
> Date: Mon, 22 Sep 2008 10:49:41 +0200
> From: "Sofia Aparicio Secanellas" <saparicio_at_[hidden]>
> Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
> To: "Open MPI Users" <users_at_[hidden]>
> Message-ID: <2F607CC2B43A422B80CEBBD540BFFE8B_at_aparicio1>
> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>
> Hello Terry,
>
> I do not have an active firewall. I have typed on both computers:
>
> netstat -lnut
>
> I enclose the results. I have also run on both computers:
>
> mpirun -np 2 --host 10.1.10.208,10.1.10.240 --mca mpi_preconnect_all 1
> --prefix /usr/local -mca btl self,tcp -mca btl_tcp_if_include eth1
> ./PruebaSumaParalela.out
>
> I enclose the results.
>
> Thank you.
>
> Sofia
>
> ----- Original Message -----
> From: "Terry Dontje" <Terry.Dontje_at_[hidden]>
> To: <users_at_[hidden]>
> Sent: Friday, September 19, 2008 7:54 PM
> Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
>
>
>>> > > Hello Sofia,
>>> > >
>>> > > After further reflection I wonder if you have a firewall that is
>>> > > preventing connections to certain ports.
>>> > >
>>> > > --td
>>> > >
>>> > > Terry Dontje wrote:
>>>
>>
>>>>> >> >> Hello Sofia,
>>>>> >> >>
>>>>> >> >> Ok, so I really wanted the stack of when you run with "-mca
>>>>> >> >> mpi_preconnect_all 1" I believe you'll see that one of the
>>>>> >> >> processes will be in init. However, the stack still probably
>>>>> >> >> will not help me help you. What needs to happen is to step
>>>>> >> >> through the code in dbx while the connection is trying to be
>>>>> >> >> established. I am hoping you might find the connect call fails
>>>>> >> >> or that we've been given an interface that somehow cannot reach
>>>>> >> >> the other node. However, when you specified "-mca
>>>>> >> >> btl_tcp_if_include eth1" that should have forced things to use
>>>>> >> >> the interface you need. So it really comes down to why are we
>>>>> >> >> not connecting to the eth1 address? Are we failing on routing
>>>>> >> >> to that address or is the connect failing because we are trying
>>>>> >> >> to use a port that we are not really allowed to use or is it
>>>>> >> >> something else?
>>>>> >> >>
>>>>> >> >> I don't think it is a routing problem since you are able to
>>>>> >> >> reach each node via ssh. Is there someone else on the list that
>>>>> >> >> might want to lend a hand here? I feel like I am missing
>>>>> >> >> something obvious going on here.
>>>>> >> >>
>>>>> >> >> --td
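
The question raised above (routing failure, a port that cannot be used, or something else) can be explored outside of Open MPI with a plain TCP connect test. The following is a minimal Python sketch, not part of Open MPI; probe_tcp is a hypothetical helper name, and the host and port to test are assumptions the reader must supply, for example a listening port taken from the "netstat -lnut" output on the other node. As a rough guide, a timeout often points to packets being dropped silently, while ECONNREFUSED usually means the host was reached but nothing accepted the connection on that port.

```python
import errno
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try one TCP connect and return "ok" or a short failure description."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        # No answer within `timeout` seconds.
        return "timeout"
    except OSError as exc:
        # Map the numeric errno to its symbolic name when one is available.
        name = errno.errorcode.get(exc.errno, "unknown") if exc.errno else "unknown"
        return f"failed: {name}"


if __name__ == "__main__":
    # Self-contained demo on the loopback interface: open a listener on an
    # ephemeral port, probe it, then close it and probe the same port again.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    print("listening port:", probe_tcp("127.0.0.1", port))
    listener.close()
    print("closed port:", probe_tcp("127.0.0.1", port))
```

To apply it to the two nodes in this thread, one would call probe_tcp("10.1.10.240", port) from 10.1.10.208 and the reverse from the other machine, with a port known to be listening on the target.
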
>>>>>
>>>
>>>>>>> >>> >>> Date: Fri, 19 Sep 2008 16:09:11 +0200
>>>>>>> >>> >>> From: "Sofia Aparicio Secanellas"
>>>>>>> >>> >>> <saparicio_at_[hidden]>
>>>>>>> >>> >>> Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
>>>>>>> >>> >>> To: "Open MPI Users" <users_at_[hidden]>
>>>>>>> >>> >>> Message-ID: <1BBF50FE29F743B5829CC3785F47CADD_at_aparicio1>
>>>>>>> >>> >>> Content-Type: text/plain; charset="iso-8859-1";
>>>>>>> >>> >>> Format="flowed"
>>>>>>> >>> >>>
>>>>>>> >>> >>> Hello Terry,
>>>>>>> >>> >>>
>>>>>>> >>> >>> I have installed 1.2.7 and I obtain the same result.
>>>>>>> >>> >>>
>>>>>>> >>> >>> I will explain what I have done.
>>>>>>> >>> >>>
>>>>>>> >>> >>> 1. On my computer edu_at_10.1.10.240 I have added a new user
>>>>>>> >>> >>> called sofia. This way I have sofia_at_10.1.10.208 and
>>>>>>> >>> >>> sofia_at_10.1.10.240.
>>>>>>> >>> >>> 2. I have downloaded the openmpi 1.2.7 from the openmpi
>>>>>>> >>> >>> website on both computers in /home/sofia/Desktop.
>>>>>>> >>> >>> 3. I have installed everything using "sudo ./configure",
>>>>>>> >>> >>> "sudo make" and "sudo make install".
>>>>>>> >>> >>> 4. To make ssh not ask me for a password, I have typed in
>>>>>>> >>> >>> sofia_at_10.1.10.208 "ssh-keygen -t dsa", "cd $HOME/.ssh" and
>>>>>>> >>> >>> "cp id_dsa.pub authorized_keys". I have copied the directory
>>>>>>> >>> >>> "/home/sofia/.ssh" from sofia_at_10.1.10.208 to
>>>>>>> >>> >>> /home/sofia/.ssh in sofia_at_10.1.10.240. The ssh command
>>>>>>> >>> >>> without password works on computer sofia_at_10.1.10.208 but
>>>>>>> >>> >>> computer sofia_at_10.1.10.208 asks me for a passphrase and for
>>>>>>> >>> >>> the password. Is this normal?
>>>>>>> >>> >>> 5. I have created a directory
>>>>>>> >>> >>> "/home/sofia/programasparalelos" on both computers and I
>>>>>>> >>> >>> have given permissions to the directory with "chmod 777".
>>>>>>> >>> >>> 6. I have copied the program "PruebaSumaParalela.c" to
>>>>>>> >>> >>> "/home/sofia/programasparalelos" on both computers (I have
>>>>>>> >>> >>> changed the program slightly; I enclose the new version) and
>>>>>>> >>> >>> compiled it using "mpicc PruebaSumaParalela.c -o
>>>>>>> >>> >>> PruebaSumaParalela.out".
>>>>>>> >>> >>>
>>>>>>> >>> >>> 7. Now I run the program on both computers using the command:
>>>>>>> >>> >>>
>>>>>>> >>> >>> mpirun -np2 --host 10.1.10.208,10.1.10.240 --prefix
>>>>>>> >>> >>> /usr/local ./PruebaSumaParalela.out
>>>>>>> >>> >>>
>>>>>>> >>> >>> When I run the program I see 3 PIDs executing on each
>>>>>>> >>> >>> computer: 2 of "./PruebaSumaParalela.out" and 1 of
>>>>>>> >>> >>> "mpirun -np2 --host 10.1.10.208,10.1.10.240 --prefix
>>>>>>> >>> >>> /usr/local ./PruebaSumaParalela.out". I enclose the
>>>>>>> >>> >>> results obtained on each computer for each
>>>>>>> >>> >>> "./PruebaSumaParalela.out".
>>>>>>> >>> >>>
>>>>>>> >>> >>> Thank you very much.
>>>>>>> >>> >>>
>>>>>>> >>> >>> Sofia
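
On the ssh question in step 4 above: two common reasons for still being prompted are a private key that was created with a passphrase, and permissions that sshd rejects when StrictModes is enabled (a home directory, ~/.ssh, or authorized_keys writable by group or others). Whether either applies to the machines in this thread is not established. The Python sketch below checks for both. The helper names are hypothetical, the id_dsa file name is assumed from the default of "ssh-keygen -t dsa", and the passphrase test only recognizes the traditional PEM key format, not the newer OpenSSH key format.

```python
import os
import stat


def writable_by_others(path: str) -> bool:
    """True if group or other users have write permission on path."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))


def pem_key_is_encrypted(key_text: str) -> bool:
    """True if a traditional PEM private key carries an encryption header."""
    return "Proc-Type: 4,ENCRYPTED" in key_text


def check_ssh_setup(home: str) -> list:
    """Collect findings for home and home/.ssh; an empty list means none."""
    findings = []
    ssh_dir = os.path.join(home, ".ssh")
    # Paths whose loose permissions can make sshd ignore authorized_keys.
    for path in (home, ssh_dir, os.path.join(ssh_dir, "authorized_keys")):
        if os.path.exists(path) and writable_by_others(path):
            findings.append(f"{path} is group/world-writable")
    # Assumed default private key name for `ssh-keygen -t dsa`.
    key_path = os.path.join(ssh_dir, "id_dsa")
    if os.path.exists(key_path):
        with open(key_path, "r", encoding="ascii", errors="replace") as fh:
            if pem_key_is_encrypted(fh.read()):
                findings.append(f"{key_path} is passphrase-protected")
    return findings
```

Calling check_ssh_setup(os.path.expanduser("~")) as the user sofia on each machine returns a list of findings. A passphrase-protected key would explain the passphrase prompt, and a writable path would be one possible reason for falling back to the password.
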
>>>>>>> >>> >>>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users