What version of OMPI are you using? That error message looks like something from an ancient version - might be worth updating.
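In the meantime, note that errno=110 on Linux is ETIMEDOUT, and that working ping and ssh only show that ICMP and port 22 get through. Open MPI's TCP transport opens its own connections on other ports, which a firewall between Internet hosts may silently drop. Below is a minimal sketch, not part of Open MPI, for checking whether an arbitrary TCP port on one node is reachable from another. The function name `probe_tcp` and the host and port given on the command line are placeholders for illustration, and it assumes something is already listening on that port on the remote node.

```python
import errno
import socket
import sys


def probe_tcp(host, port, timeout=5.0):
    """Attempt one TCP connection and return (ok, detail)."""
    try:
        # create_connection resolves the host and applies the timeout to connect()
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        # No answer within `timeout` seconds; typical of a firewall dropping packets
        return False, "timed out"
    except OSError as exc:
        # Translate the numeric errno (e.g. ECONNREFUSED) into its symbolic name
        name = errno.errorcode.get(exc.errno, "unknown")
        return False, f"{name}: {exc.strerror}"


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical file name): python probe_tcp.py <host> <port>
    print(probe_tcp(sys.argv[1], int(sys.argv[2])))
```

A "timed out" result from B to C on a port where C is listening would point at filtering between the nodes rather than at a timeout setting that is too small.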
On Dec 13, 2010, at 4:04 AM, peifan wrote:
> I have 3 nodes: one is the master node and the others are compute nodes. These nodes are deployed on the Internet (not in a cluster).
>
> When I run NPB (NASA Parallel Benchmarks) on one node (using 2 processes):
> mpirun -np 2 exe.
> I get a successful result, but when I run on two nodes (for example, on nodes B and C), it fails:
> mpirun -nolocal -hostfile hostfile -np 2 exe.
> The failure output is:
> B [0,1,0] connectimeout ,connect() fail errno=110
> C [0,1,1] connectimeout ,connect() fail errno=110
> But the connection between B and C has no problem, because I can ping and ssh from B to C (and from C to B).
> I think this problem may be caused by the connectimeout parameter (perhaps it is so small that it leads to the failure?), because my nodes are deployed on the Internet, so the latency is higher.
> Can anyone help me solve this problem, and how do I set the connectimeout in Open MPI?
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users