Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Reply: example/Hello_c.c : mpirun run failed on two physical nodes.
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-03-26 05:45:02


Can you please configure OMPI with --enable-debug, and then execute

mpirun -mca plm_base_verbose 10 -host ib03 hostname

This will provide debug information about the problem.
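
As a rough sketch, the checks discussed in this thread can be collected in one small script. It assumes an Open MPI build configured with --enable-debug is already installed on both nodes. REMOTE_HOST, DRY_RUN and the helper names step and diagnose are illustrative, not part of Open MPI. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only: print (DRY_RUN=1, the default) or run (DRY_RUN=0) the checks.
# REMOTE_HOST, DRY_RUN, step and diagnose are illustrative assumptions.

REMOTE_HOST="${REMOTE_HOST:-ib03}"
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode; otherwise execute it.
step() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

diagnose() {
    # 1. Environment seen by a non-interactive ssh login on the remote node.
    step ssh "$REMOTE_HOST" 'echo $PATH; echo $LD_LIBRARY_PATH'
    # 2. Is the Open MPI daemon (orted) found in that environment?
    step ssh "$REMOTE_HOST" 'command -v orted'
    # 3. Launch with verbose output from the process launcher (plm) framework.
    step mpirun -mca plm_base_verbose 10 -host "$REMOTE_HOST" hostname
}

diagnose
```

If step 2 prints nothing when run for real, the remote non-interactive shell probably does not have the Open MPI bin directory on its PATH, which would be consistent with mpirun returning silently.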

Thanks
Ralph

On Tue, Mar 25, 2014 at 9:51 PM, Wang,Yanfei(SYS) <wangyanfei01_at_[hidden]> wrote:

> Hi,
>
> Thanks Jeff. I have not yet figured out what is happening, even with this FAQ.
>
> 1. Remote login over ssh works:
> [root_at_bb-nsi-ib04 examples]# ssh ib03 hostname
> bb-nsi-ib03.bb01.*.com
> [root_at_bb-nsi-ib04 examples]#
> 2. The following command returns immediately without printing anything:
> [root_at_bb-nsi-ib04 examples]# mpirun -host ib03 hostname
> [root_at_bb-nsi-ib04 examples]#
> 3. The following command executes successfully:
> [root_at_bb-nsi-ib04 examples]# ssh ib03 mpirun
> --------------------------------------------------------------------------
> mpirun could not find anything to do.
>
> It is possible that you forgot to specify how many processes to run
> via the "-np" argument.
> --------------------------------------------------------------------------
> [root_at_bb-nsi-ib04 examples]#
>
> So, does it seem that the non-interactive shell profile is not
> correctly configured? Step 3 can execute successfully...
>
> Hoping for a response!
>
> BR
>
> Yanfei Wang
>
> -----Original Message-----
> From: devel [mailto:devel-bounces_at_[hidden]] On Behalf Of Jeff Squyres (jsquyres)
> Sent: March 25, 2014 22:09
> To: Open MPI Developers
> Subject: Re: [OMPI devel] example/Hello_c.c : mpirun run failed on two physical
> nodes.
>
> Try this FAQ entry:
>
> http://www.open-mpi.org/faq/?category=running#diagnose-multi-host-problems
>
> On Mar 25, 2014, at 6:54 AM, "Wang,Yanfei(SYS)" <wangyanfei01_at_[hidden]>
> wrote:
>
> > Hi,
> >
> > I am new to Open MPI programming and am having some trouble
> > getting MPI programs to run. I hope someone can help.
> >
> > The examples/hello_c program runs successfully with 2 processes on the local
> > machine, but it does not work across two separate physical nodes.
> >
> > Two physical nodes:
> > [root_at_bb-nsi-ib04 examples]# mpirun -hostfile hosts -np 2 hello_c
> > The command just returns instantly with nothing printed.
> >
> > Local machine:
> > [root_at_bb-nsi-ib04 examples]# mpirun -np 2 hello_c
> > Hello, world, I am 0 of 2, (Open MPI v1.7.5, package: Open MPI root_at_bb-nsi-ib04.bb01.*.com Distribution, ident: 1.7.5, Mar 20, 2014, 108)
> > Hello, world, I am 1 of 2, (Open MPI v1.7.5, package: Open MPI root_at_bb-nsi-ib04.bb01.*.com Distribution, ident: 1.7.5, Mar 20, 2014, 108)
> > [root_at_bb-nsi-ib04 examples]#
> >
> > The peer machine is also OK:
> > [root_at_bb-nsi-ib03 examples]# mpirun -np 2 hello_c
> > Hello, world, I am 0 of 2, (Open MPI v1.7.5, package: Open MPI root_at_bb-nsi-ib03.bb01.*.com Distribution, ident: 1.7.5, Mar 20, 2014, 108)
> > Hello, world, I am 1 of 2, (Open MPI v1.7.5, package: Open MPI root_at_bb-nsi-ib03.bb01.*.com Distribution, ident: 1.7.5, Mar 20, 2014, 108)
> > [root_at_bb-nsi-ib03 examples]#
> > The command runs successfully and prints two messages.
> >
> > Configuration details:
> > OpenMPI version: 1.7.5
> > Hostfile:
> > [root_at_bb-nsi-ib04 examples]# cat hosts
> > ib03 slots=1
> > ib04 slots=1
> > [root_at_bb-nsi-ib04 examples]#
> > /etc/hosts:
> > [root_at_bb-nsi-ib04 examples]# cat /etc/hosts
> > 192.168.71.3 ib03
> > 192.168.71.4 ib04
> > SSH:
> > The public RSA key has been distributed to both machines, ib03 and ib04, and
> > I am sure that passwordless ssh login works.
> >
> > I am confused by this problem. Can anyone help? There is no log output
> > or error message; could anyone tell me how to diagnose it?
> >
> > BR
> >
> > Yanfei Wang
> >
> >
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> > http://www.open-mpi.org/community/lists/devel/2014/03/14385.php
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>