Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mpirun works locally but not through network
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2010-04-28 06:28:59

So the processes are running - good. The hang might then be occurring in the TCP wireup of MPI communications (OMPI creates connections between processes lazily, only when they first communicate).

What is the TCP setup between the two machines? (IP address, netmask, etc.) Do you have any firewall software running?
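
One way to start on the addressing question is to check whether the two machines' addresses fall in the same network under the configured netmask. The following is only a sketch in POSIX shell: the addresses are made-up placeholders, and `ip_to_int` and `same_subnet` are hypothetical helper names, not part of Open MPI.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Open MPI) for comparing two IPv4 addresses.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit status 0) if both addresses share a network under the netmask.
same_subnet() {
    net1=$(( $(ip_to_int "$1") & $(ip_to_int "$3") ))
    net2=$(( $(ip_to_int "$2") & $(ip_to_int "$3") ))
    [ "$net1" -eq "$net2" ]
}

# Placeholder addresses; substitute the values reported on each machine.
if same_subnet 192.168.56.101 192.168.56.102 255.255.255.0; then
    echo "same subnet"
else
    echo "different subnets"
fi
```

If the two addresses turn out to be on different networks, the machines may be reachable for ssh over one interface while MPI traffic is attempted over another.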

Sent from my PDA. No type good.


From: users-bounces_at_[hidden] <users-bounces_at_[hidden]>
To: Open MPI Users <users_at_[hidden]>
Sent: Wed Apr 28 04:08:23 2010
Subject: Re: [OMPI users] mpirun works locally but not through network

Thanks for your suggestion!
"$ mpirun --host localhost,name_of_distant_machine hostname" works.
In fact, the simple program that prints "I am process #" always works. The problem arises only when there is communication between processes that are running on two different computers.

I don't think it is an ssh/rsh problem, because it works well if I put only name_of_distant_machine in the --host option. Maybe it is because the two computers are virtual machines and the default network for mpirun is eth0 and not eth1?
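
If the interface guess is right, one thing to try is restricting Open MPI's TCP transport to a single interface with the `btl_tcp_if_include` MCA parameter. The sketch below only prints the command line rather than running it; `mpirun_on_iface` is a hypothetical helper, and `./alo_example` is an assumed name for the compiled test program.

```shell
#!/bin/sh
# Hypothetical helper: print (but do not run) an mpirun command line that
# restricts Open MPI's TCP BTL to one network interface.
mpirun_on_iface() {
    iface=$1
    hosts=$2
    shift 2
    echo "mpirun --mca btl_tcp_if_include $iface --host $hosts $*"
}

# eth1 and the host list come from the message above; ./alo_example is an
# assumed name for the compiled test program.
mpirun_on_iface eth1 localhost,name_of_distant_machine ./alo_example
```

Running the printed command is only useful if eth1 really is the interface that connects the two virtual machines.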


--- On Tue 27.4.10, Jeff Squyres <jsquyres_at_[hidden]> wrote:

        From: Jeff Squyres <jsquyres_at_[hidden]>
        Subject: Re: [OMPI users] mpirun works locally but not through network
        To: "Open MPI Users" <users_at_[hidden]>
        Date: Tuesday 27 April 2010, 7:46
        I'm not intimately familiar with boost++ -- you might want to try the "hello world" and "ring" example programs in the OMPI examples/ directory as a baseline.
        Additionally, try executing a non-MPI program such as "hostname" to verify that your remote connectivity is working. For example:
        $ mpirun --host localhost,name_of_distant_machine hostname
        You should see the output of both "hostname" executions. If you don't, check the process table and see if OMPI is trying to ssh or rsh over to the remote host, and see what is happening on the remote host. E.g., is that rsh or ssh being blocked? Or is it actually executing on the remote machine and hanging? Or ...?
        Ensure that you have the same version of OMPI installed on both machines and that both are in your default search PATH for non-interactive logins.
        Once you get something like "hostname" to work, it's much more likely that an MPI application will also work.
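
The version and PATH check described above could be sketched roughly as follows. `compare_versions` is a hypothetical helper, and the banner strings in the demonstration are made up; the commented lines show how the real banners might be gathered.

```shell
#!/bin/sh
# Hypothetical helper: compare two version banners and report the result.
compare_versions() {
    if [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch: local='$1' remote='$2'"
    fi
}

# On a real setup the banners would be gathered like this; ssh runs a
# non-interactive shell, so it also shows whether mpirun is on that PATH:
#   local_ver=$(mpirun --version 2>&1 | head -n 1)
#   remote_ver=$(ssh name_of_distant_machine 'mpirun --version' 2>&1 | head -n 1)
#   compare_versions "$local_ver" "$remote_ver"

# Demonstration with made-up banner strings:
compare_versions "mpirun (Open MPI) 1.4.1" "mpirun (Open MPI) 1.4.1"
```

A "command not found" message in place of the remote banner would suggest that mpirun is missing from the non-interactive PATH on the remote host.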
        On Apr 27, 2010, at 10:19 AM, Nguyen Kim Son wrote:
> Hi all,
> I am writing a small program in which the process of rank 0 sends "alo alo" to the process of rank 1, and process 1 then prints this message on screen. I am using the boost++ library, but the result stays the same when I use the MPI standard.
> The program works locally (that is, mpirun --host localhost) and on the distant machine (mpirun --host name_of_distant_machine), but not on both (mpirun --host localhost, name_of_distant_machine). There is no error message, so I have no idea how to resolve this.
> The machine I am running on is a virtual one, and so is the distant machine.
> Thank you in advance!
> Son.
> Nguyen Kim Son.
> Antibes, France
> Tel: +336 48 28 37 47
> <alo_example.cpp>
        Jeff Squyres