Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with getting started [solved]
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-06-21 09:18:04


It could also have been that you didn't have exactly matching
installations on both machines. Even if they were the same version,
if they weren't configured / installed the same way on both machines,
this could have led to problems. Also be sure that either the MPI
application is compatible / runnable on both systems or you have a
separate MPI application binary for each system (e.g., to account for
glibc and other differences between your two OS's).
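
A rough way to compare the two systems is sketched below. It assumes both
machines use glibc; the function names (built_against_glibc,
running_on_glibc, host_name, environment_report) are made up for this
example and are not part of Open MPI. It makes no MPI calls, so it can be
launched the same way as the uptime test quoted further down.

```cpp
#include <gnu/libc-version.h>  // gnu_get_libc_version (glibc-specific)
#include <unistd.h>            // gethostname
#include <sstream>
#include <string>

// glibc version of the headers this binary was compiled against.
std::string built_against_glibc()
{
    std::ostringstream out;
    out << __GLIBC__ << "." << __GLIBC_MINOR__;
    return out.str();
}

// glibc version of the library this process is running with.
std::string running_on_glibc()
{
    return gnu_get_libc_version();
}

// Name of the host this process was started on ("unknown" on failure).
std::string host_name()
{
    char buf[256] = {0};
    if (gethostname(buf, sizeof(buf) - 1) != 0)
        return "unknown";
    return buf;
}

// One line per process, so the output from both hosts can be compared.
std::string environment_report()
{
    return host_name() + ": built against glibc " + built_against_glibc()
         + ", running on glibc " + running_on_glibc();
}
```

Print environment_report() from main, build the program on one machine,
and start it on both hosts with mpirun -np 2 --hostfile myhosts. If the
"built against" and "running on" versions differ on one host, that is a
hint (not proof) that a separate binary per system may be needed.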

Running in heterogeneous situations like that is quite difficult to
do, and not for the meek. :-)

On Jun 13, 2008, at 2:12 AM, Manuel Freiberger wrote:

> Hello,
>
> Well, actually I'm quite sure that it was not the firewall because I had to
> turn it off as otherwise no connection could be established. So my
>     iptables --list
> returns
>
> Chain INPUT (policy ACCEPT)
> target prot opt source destination
>
> Chain FORWARD (policy ACCEPT)
> target prot opt source destination
>
> Chain OUTPUT (policy ACCEPT)
> target prot opt source destination
>
> on both machines. After reinstalling OMPI, I did not make any changes to the
> firewall but now it works without problems. Probably installing the library
> with exactly the same configuration (same --prefix and so on) did the trick.
>
> But nonetheless, thank you very much for your hint! :-)
>
> Best regards,
> Manuel
>
> On Thursday 12 June 2008 18:23, Rainer Keller wrote:
>> Hi,
>> are you sure it was not a firewall issue on the SUSE 10.2 machine?
>> If there are any connections from the Gentoo machine trying to access the
>> orted on the SUSE machine, check in /var/log/firewall.
>>
>> For the time being, try stopping the firewall (as root) with
>>     /etc/init.d/SuSEfirewall2_setup stop
>> and test whether it works ,-]
>>
>> With best regards,
>> Rainer
>>
>> On Thursday, 12 June 2008, Manuel Freiberger wrote:
>>> Hi!
>>>
>>> Ok, I found the problem. I reinstalled OMPI on both PCs, but this time
>>> only locally in the user's home directory. Now the sample code works
>>> perfectly. I'm not sure where the error really was located. It could be
>>> that it was a problem with the Gentoo installation because OMPI is still
>>> marked unstable in portage (~x86 keyword).
>>>
>>> Best regards,
>>> Manuel
>>>
>>> On Wednesday 11 June 2008 18:52, Manuel Freiberger wrote:
>>>> Hello everybody!
>>>>
>>>> First of all I wanted to point out that I'm a beginner regarding Open MPI
>>>> and all I am trying to achieve is to get a simple program working on
>>>> two PCs. So far I've installed Open MPI 1.2.6 on two PCs (one running
>>>> OpenSUSE 10.2, the other one Gentoo).
>>>> I set up two identical users on both systems and made sure that I can
>>>> make an SSH connection between them using private/public key
>>>> authentication.
>>>>
>>>> Next I ran the command
>>>>     mpirun -np 2 --hostfile myhosts uptime
>>>> which gave the result
>>>>     6:41pm up 1 day 5:16, 4 users, load average: 0.00, 0.07, 0.17
>>>>     18:43:45 up 7:36, 6 users, load average: 0.00, 0.02, 0.05
>>>> so I concluded that MPI should work in principle.
>>>>
>>>> Next I tried the following code which I copied from Boost.MPI:
>>>> ---- snip
>>>> #include <mpi.h>
>>>> #include <iostream>
>>>>
>>>> int main(int argc, char* argv[])
>>>> {
>>>>     MPI_Init(&argc, &argv);
>>>>     int rank;
>>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>>     if (rank == 0)
>>>>     {
>>>>         std::cout << "Rank 0 is going to send" << std::endl;
>>>>         int value = 17;
>>>>         int result = MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
>>>>         if (result == MPI_SUCCESS)
>>>>             std::cout << "Rank 0 OK!" << std::endl;
>>>>     }
>>>>     else if (rank == 1)
>>>>     {
>>>>         std::cout << "Rank 1 is waiting for answer" << std::endl;
>>>>         int value;
>>>>         MPI_Status status;
>>>>         int result = MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
>>>>                               &status);
>>>>         if (result == MPI_SUCCESS && value == 17)
>>>>             std::cout << "Rank 1 OK!" << std::endl;
>>>>     }
>>>>     MPI_Finalize();
>>>>     return 0;
>>>> }
>>>> ---- snap
>>>>
>>>> Starting a parallel job using
>>>>     mpirun -np 2 --hostfile myhosts mpi-test
>>>> I get the answer
>>>>     Rank 0 is going to send
>>>>     Rank 1 is waiting for answer
>>>>     Rank 0 OK!
>>>> and then the program hangs. So the strange thing is that the recv() call
>>>> is obviously blocking, which is what I do not understand.
>>>>
>>>> Could anybody provide some hints on where I should start looking for the
>>>> mistake? Any help is welcome!
>>>>
>>>> Best regards,
>>>> Manuel
>
> --
> Manuel Freiberger
> Institute of Medical Engineering
> Graz University of Technology
> manuel.freiberger_at_[hidden]

-- 
Jeff Squyres
Cisco Systems