Open MPI Development Mailing List Archives

Subject: [OMPI devel] Re: Re: example/Hello_c.c : mpirun run failed on two physical nodes.
From: Wang,Yanfei(SYS) (wangyanfei01_at_[hidden])
Date: 2014-03-27 01:37:56


Hi,

I found that the servers running Open MPI have iptables enabled, so further communication between ib03 and ib04 was blocked. mpirun now works fine across multiple servers with hello_c.

Thanks, Ralph!

Thanks
-Yanfei
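
For readers who hit the same symptom, here is a minimal sketch of how the firewall rules on each node might be inspected. It assumes root access and an iptables version that supports the `-S` listing option; `blocking_rules` is a hypothetical helper name, not part of Open MPI or iptables.

```shell
# Sketch only: blocking_rules is a hypothetical helper, not part of Open MPI.
# It reads `iptables -S` output on stdin and prints the rules that can block
# traffic: a DROP policy on INPUT/OUTPUT, or an INPUT/OUTPUT rule that jumps
# to DROP or REJECT.
blocking_rules() {
    grep -E -e '^-P (INPUT|OUTPUT) DROP' \
            -e '^-A (INPUT|OUTPUT) .*-j (DROP|REJECT)'
}

# Usage (as root), once locally and once per remote node:
#   iptables -S | blocking_rules
#   ssh ib03 iptables -S | blocking_rules
```

An empty result does not prove the firewall is open: rules placed in user-defined chains are not matched by this filter, so the full `iptables -S` output is still worth reading.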

From: devel [mailto:devel-bounces_at_[hidden]] On Behalf Of Ralph Castain
Sent: March 26, 2014 17:45
To: Open MPI Developers
Subject: Re: [OMPI devel] Re: example/Hello_c.c : mpirun run failed on two physical nodes.

Can you please configure OMPI with --enable-debug, and then execute
mpirun -mca plm_base_verbose 10 -host ib03 hostname
This will provide debug information about the problem.
Thanks
Ralph

On Tue, Mar 25, 2014 at 9:51 PM, Wang,Yanfei(SYS) <wangyanfei01_at_[hidden]> wrote:
Hi,

Thanks, Jeff. I went through this FAQ entry but have not yet figured out what is going wrong.

1. SSH remote login works:
[root_at_bb-nsi-ib04 examples]# ssh ib03 hostname
bb-nsi-ib03.bb01.*.com
[root_at_bb-nsi-ib04 examples]#
2. The following command returns immediately with no output:
[root_at_bb-nsi-ib04 examples]# mpirun -host ib03 hostname
[root_at_bb-nsi-ib04 examples]#
3. The following command executes successfully:
[root_at_bb-nsi-ib04 examples]# ssh ib03 mpirun
--------------------------------------------------------------------------
mpirun could not find anything to do.

It is possible that you forgot to specify how many processes to run
via the "-np" argument.
--------------------------------------------------------------------------
[root_at_bb-nsi-ib04 examples]#

So, does it seem that the non-interactive shell profile is not configured correctly? Yet step 3 executes successfully...

Any response would be appreciated!

BR

Yanfei Wang
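
As a rough illustration of the non-interactive shell question above, the sketch below prints what a non-interactive SSH session on a remote node actually sees. It is an assumption-laden example, not the FAQ's own procedure: `check_remote_env` and the `SSH` override variable are hypothetical names, and the remote login shell is assumed to be Bourne-compatible.

```shell
# Sketch only: check_remote_env and the SSH variable are hypothetical names.
# SSH is overridable so the function can be dry-run (for example SSH=echo).
SSH=${SSH:-ssh}

check_remote_env() {
    host=$1
    # Single quotes keep $PATH and $LD_LIBRARY_PATH unexpanded locally,
    # so the remote non-interactive shell expands them instead.
    $SSH "$host" 'command -v mpirun; command -v orted; echo "PATH=$PATH"; echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"'
}

# Usage:
#   check_remote_env ib03
```

If `mpirun` and `orted` both resolve and the library path looks right, as step 3 already suggests for `mpirun`, the shell profile is probably not the culprit and the network path between the nodes is the next thing to check.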

-----Original Message-----
From: devel [mailto:devel-bounces_at_[hidden]] On Behalf Of Jeff Squyres (jsquyres)
Sent: March 25, 2014 22:09
To: Open MPI Developers
Subject: Re: [OMPI devel] example/Hello_c.c : mpirun run failed on two physical nodes.

Try this FAQ entry:

http://www.open-mpi.org/faq/?category=running#diagnose-multi-host-problems

On Mar 25, 2014, at 6:54 AM, "Wang,Yanfei(SYS)" <wangyanfei01_at_[hidden]> wrote:

> Hi,
>
> I am new to Open MPI programming and am having trouble getting an MPI program to run. I hope someone can help.
>
> The examples/hello_c program runs successfully with 2 processes on the local machine, but it does not work across two separate physical nodes.
>
> On two physical nodes:
> [root_at_bb-nsi-ib04 examples]# mpirun -hostfile hosts -np 2 hello_c
> The command returns instantly with nothing printed.
> On the local machine:
> [root_at_bb-nsi-ib04 examples]# mpirun -np 2 hello_c
> Hello, world, I am 0 of 2, (Open MPI v1.7.5, package: Open MPI root_at_bb-nsi-ib04.bb01.*.com Distribution, ident: 1.7.5, Mar 20, 2014, 108)
> Hello, world, I am 1 of 2, (Open MPI v1.7.5, package: Open MPI root_at_bb-nsi-ib04.bb01.*.com Distribution, ident: 1.7.5, Mar 20, 2014, 108)
> [root_at_bb-nsi-ib04 examples]#
> The peer machine is also OK:
> [root_at_bb-nsi-ib03 examples]# mpirun -np 2 hello_c
> Hello, world, I am 0 of 2, (Open MPI v1.7.5, package: Open MPI root_at_bb-nsi-ib03.bb01.*.com Distribution, ident: 1.7.5, Mar 20, 2014, 108)
> Hello, world, I am 1 of 2, (Open MPI v1.7.5, package: Open MPI root_at_bb-nsi-ib03.bb01.*.com Distribution, ident: 1.7.5, Mar 20, 2014, 108)
> [root_at_bb-nsi-ib03 examples]#
> The command runs successfully and prints two messages.
>
> Configuration details:
> OpenMPI version: 1.7.5
> Hostfile:
> [root_at_bb-nsi-ib04 examples]# cat hosts
> ib03 slots=1
> ib04 slots=1
> [root_at_bb-nsi-ib04 examples]#
> /etc/hosts:
> [root_at_bb-nsi-ib04 examples]# cat /etc/hosts
> 192.168.71.3 ib03
> 192.168.71.4 ib04
> SSH:
> The public RSA key has been distributed to both machines, ib03 and ib04, and I am sure that passwordless SSH login works.
>
> I am confused by this problem. Can anyone help? There is no log output or error message; could anyone tell me how to diagnose it?
>
> BR
>
> Yanfei Wang
>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/03/14385.php


--
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/

_______________________________________________
devel mailing list
devel_at_[hidden]
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: http://www.open-mpi.org/community/lists/devel/2014/03/14396.php