Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] example/Hello_c.c : mpirun run failed on two physical nodes.
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2014-03-25 10:09:17


Try this FAQ entry:

http://www.open-mpi.org/faq/?category=running#diagnose-multi-host-problems
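As a rough sketch of the first checks that are usually worth making when a multi-host run exits silently (this is not a transcript of the FAQ entry): confirm non-interactive ssh to each host, confirm the Open MPI runtime daemon is on the non-interactive PATH there, then launch a non-MPI program before retrying hello_c. The function name print_checks is made up for illustration, the `which orted` check assumes the default ssh-based launcher, and the function only prints the commands so they can be reviewed before being run by hand.

```shell
#!/bin/sh
# Print (do not run) basic multi-host checks for each host in an Open MPI hostfile.
# Usage: print_checks hosts
print_checks() {
    hostfile=$1
    # The first field of each line is the host name; the rest (e.g. slots=1) is ignored.
    while read -r host rest; do
        # Skip blank lines and comment lines.
        case $host in
            ''|\#*) continue ;;
        esac
        # Non-interactive login should succeed with no password prompt and no extra output.
        echo "ssh $host hostname"
        # Assumption: mpirun starts the orted daemon on remote hosts over ssh,
        # so it must be found on the PATH of a non-interactive shell.
        echo "ssh $host which orted"
    done < "$hostfile"
    # Launch a plain non-MPI program across the hosts before trying hello_c.
    echo "mpirun -hostfile $hostfile -np 2 hostname"
}
```

Running `print_checks hosts` in the examples directory lists the commands for ib03 and ib04. If the final mpirun command also prints nothing, the problem is more likely in the remote launch environment than in hello_c itself.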

On Mar 25, 2014, at 6:54 AM, "Wang,Yanfei(SYS)" <wangyanfei01_at_[hidden]> wrote:

> Hi,
>
> I am new to Open MPI programming and am having trouble running MPI programs. I hope someone can help.
>
> The hello_c example runs successfully with 2 processes on the local machine, but it does not work across two separate physical nodes.
>
> Two physical nodes, e.g.:
> [root_at_bb-nsi-ib04 examples]# mpirun -hostfile hosts -np 2 hello_c
> The command just returns instantly with nothing printed.
> Local machine:
> [root_at_bb-nsi-ib04 examples]# mpirun -np 2 hello_c
> Hello, world, I am 0 of 2, (Open MPI v1.7.5, package: Open MPI root_at_bb-nsi-ib04.bb01.*.com Distribution, ident: 1.7.5, Mar 20, 2014, 108)
> Hello, world, I am 1 of 2, (Open MPI v1.7.5, package: Open MPI root_at_bb-nsi-ib04.bb01.*.com Distribution, ident: 1.7.5, Mar 20, 2014, 108)
> [root_at_bb-nsi-ib04 examples]#
> -----peer machine is ok--------
> [root_at_bb-nsi-ib03 examples]# mpirun -np 2 hello_c
> Hello, world, I am 0 of 2, (Open MPI v1.7.5, package: Open MPI root_at_bb-nsi-ib03.bb01.*.com Distribution, ident: 1.7.5, Mar 20, 2014, 108)
> Hello, world, I am 1 of 2, (Open MPI v1.7.5, package: Open MPI root_at_bb-nsi-ib03.bb01.*.com Distribution, ident: 1.7.5, Mar 20, 2014, 108)
> [root_at_bb-nsi-ib03 examples]#
> The command runs successfully and prints two messages.
>
> Configuration details:
> OpenMPI version: 1.7.5
> Hostfile:
> [root_at_bb-nsi-ib04 examples]# cat hosts
> ib03 slots=1
> ib04 slots=1
> [root_at_bb-nsi-ib04 examples]#
> /etc/hosts:
> [root_at_bb-nsi-ib04 examples]# cat /etc/hosts
> 192.168.71.3 ib03
> 192.168.71.4 ib04
> SSH:
> The public RSA key has been distributed to both machines, ib03 and ib04, and I am sure that passwordless ssh login works.
>
> I am confused by this problem. Can anyone help? There is no log output and no error message; could anyone tell me how to diagnose it?
>
> BR
>
> Yanfei Wang
>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/03/14385.php

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/