Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] "ssh: connect to host XXX.XXX.XXX.XX port 22: connection timed out" errors during mpirun
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-06-05 17:24:16


It has nothing to do with OMPI - this is an ssh issue. I suspect you are simply overwhelming the connection system.

Maybe you could tell us what you are actually trying to accomplish - running thousands of mpiruns in parallel seems a tad extreme.
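
If the runs really do need to be launched from a script, one way to avoid flooding ssh is to cap how many mpirun processes are in flight at once. The following is only a sketch under assumptions: `run_throttled` is a hypothetical helper (not part of Open MPI), it needs bash 4.3 or later for `wait -n`, and a sensible limit depends on your hosts.

```shell
#!/usr/bin/env bash
# Sketch only: run a command TOTAL times, with at most MAX_JOBS copies
# running concurrently. run_throttled is a hypothetical helper name.
run_throttled() {
    local max_jobs=$1 total=$2
    shift 2                      # the remaining arguments are the command to run
    local running=0 index
    for (( index = 0; index < total; index++ )); do
        if (( running >= max_jobs )); then
            wait -n              # bash >= 4.3: block until one background job exits
            running=$(( running - 1 ))
        fi
        "$@" &                   # launch the next copy in the background
        running=$(( running + 1 ))
    done
    wait                         # collect whatever is still running
}
```

With the command from the original script, a call might look like `run_throttled 16 2000 mpirun --hostfile my_hostfile openMPI_test >> file 2>&1`, where 16 is an arbitrary starting value to tune, not a recommendation.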

On Jun 4, 2013, at 9:48 AM, vacate <vacatehoping_at_[hidden]> wrote:

> Hello everyone,
>
> After solving my first ssh_exchange_identification problem,
> I feel embarrassed to ask about another problem... :'((
>
> I get some "ssh: connect to host XXX.XXX.XXX.XX port 22: connection timed out" errors
> when I run mpirun over 2000 times at almost the same time.
> ---
> My bash script:
> for (( index=0; index<2000 ; index++))
> do
> (time mpirun --hostfile my_hostfile openMPI_test &) >> file 2>&1
> done
> ---
>
> The problem does not happen every time, but it happens often (a test rarely completes without errors).
> In addition, the number of "timed out" errors differs from test to test.
> (Out of 2000 runs, this error occurs between 0 and 200 times.)
>
> I tried to google it,
> but I can't find anyone who has had this ssh problem when opening a lot of ssh connections...
> So I thought maybe someone here has had the same problem.
>
> ----------------------------------------------------------------------------------
>
> These are the settings I have tried changing:
>
> 1. net.ipv4.tcp_fin_timeout=180
> http://askubuntu.com/questions/21182/how-to-change-the-default-timeout-of-internet-connection
>
> 2. sudo iptables -A INPUT -p tcp --dport ssh -j ACCEPT
> http://www.serkey.com/ubuntu-ssh-connection-timed-out-due-to-firewall-behgct.html
>
> ----------------------------------------------------------------------------------
> But these changes did not solve my problem...
>
> I still can't figure out where the problem is, or whether there are other potential problems :(((
>
> Does anyone here have any idea about this situation, or has anyone had the same problem?
> Is it a problem with my machine or with my system? Or does Open MPI not allow me to do something like this?
>
> I really hope someone can give me a hand.
> Thank you all very, very much!!
>
>
> Best Wishes,
> Jen
>