On Apr 19, 2007, at 11:27 PM, Babu Bhai wrote:
> I have already seen this FAQ. The nodes in the cluster do not have
> multiple IP addresses. One thing I forgot to mention is that the
> systems in the cluster do not have static IPs; they get their IP
> addresses through DHCP.
Ok, that should be fine.
> Also, if there is a print statement (printf("hello world\n");) in
> the slave, it is correctly printed on the master's console, but none
> of the MPI commands work.
I'm not sure I follow -- which MPI commands are you referring to,
mpirun? Something else?
I think you're saying that the MPI job starts up, printf works fine,
but then something goes bad...? Are you saying that MPI *functions*
don't seem to work (like MPI_SEND)? (I'm a little confused by your
use of the word "command")
If that is the case, then this is a bit more odd because it means
that OMPI started up, launched your job, and did some "out of band"
communication, but then failed the first time it tried to establish
a TCP connection for MPI traffic.
Are you running any firewall or port-blocking software on either of
the nodes? Is each node routable from the other? (In Linux, at
least, errno 113 is "no route to host", which would tend to imply
that one host could not open a socket to another because it couldn't
find a route to it.)
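One way to check the routability question outside of MPI is a small standalone probe run from one node against the other. The following is only a minimal sketch under some assumptions: the helper name try_connect and the PROBE_MAIN build macro are made up for illustration and are not part of Open MPI, and the port you pass must be one that is known to be listening on the other node (Open MPI's TCP transport picks its own ports at run time, so this tests general reachability, not the exact connection that failed).

```c
#include <errno.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not an Open MPI function): try one blocking TCP
 * connection to host:port. Returns 0 on success, the errno value from
 * socket()/connect() on failure, or -1 if the host could not be resolved. */
static int try_connect(const char *host, const char *port)
{
    struct addrinfo hints, *res, *ai;
    int err = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;       /* IPv4 only, to keep the sketch short */
    hints.ai_socktype = SOCK_STREAM; /* TCP */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    /* Try each resolved address until one connects. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) {
            err = errno;
            continue;
        }
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) {
            close(fd);
            err = 0;
            break;
        }
        err = errno; /* save before close(), which may overwrite errno */
        close(fd);
    }
    freeaddrinfo(res);
    return err;
}

#ifdef PROBE_MAIN
/* Build with, for example: cc -DPROBE_MAIN probe.c -o probe */
int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s host port\n", argv[0]);
        return 2;
    }
    int rc = try_connect(argv[1], argv[2]);
    if (rc == 0)
        printf("connected to %s:%s\n", argv[1], argv[2]);
    else if (rc < 0)
        printf("could not resolve %s\n", argv[1]);
    else
        printf("connect() failed with errno=%d (%s)\n", rc, strerror(rc));
    return rc == 0 ? 0 : 1;
}
#endif
```

If the probe reports the same errno as the MPI job when pointed at the other node, the problem is most likely in the network or firewall setup rather than in the MPI program. Note that connect() is blocking here, so a firewall that silently drops packets may make the probe wait for the system's connect timeout.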
> >I need to make that error string be google-able -- I'll add it to the
> >faq. :-)
> >The problem is likely that you have multiple IP addresses, some of
> >which are not routable to each other (but fail OMPI's routability
> >assumptions). Check out these FAQ entries:
> >Does this help?
> >On Apr 19, 2007, at 11:07 AM, Babu Bhai wrote:
> >> I have migrated from LAM/MPI to OpenMPI. I am not able to
> >> execute a simple MPI program in which the master sends an integer
> >> to the slave.
> >> If I execute the code on a single machine, i.e. start 2 instances
> >> on the same machine (mpirun -np 2 hello), it works fine.
> >> If I execute it on the cluster using
> >> mpirun --prefix /usr/local -np 2 --host 18.104.22.168,22.214.171.124 hello
> >> it gives the following error:
> >> "btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect]
> >> connect() failed with errno=113"
> > >I am using openmpi-1.2
> > >regards,
> > >Abhishek
> > >_______________________________________________
> > >users mailing list
> > >users_at_[hidden]
> > >http://www.open-mpi.org/mailman/listinfo.cgi/users
> >Jeff Squyres
> >Cisco Systems