Open MPI User's Mailing List Archives

Subject: [OMPI users] "ssh: connect to host XXX.XXX.XXX.XX port 22: connection timed out" errors during mpirun
From: vacate (vacatehoping_at_[hidden])
Date: 2013-06-04 12:48:41


Hello everyone,

After solving my first ssh_exchange_identification problem,
I feel embarrassed to ask about another problem... :'((

I get some "*ssh: connect to host XXX.XXX.XXX.XX port 22: connection timed
out*" errors
when I launch mpirun 2000 times at almost the same time.

---
My bash shell script:
   # Start 2000 mpirun jobs in the background; all output is appended to "file"
   for (( index=0; index<2000; index++ ))
   do
       (time mpirun --hostfile my_hostfile openMPI_test &) >> file 2>&1
   done
---
The problem does not *always* happen, but it happens *often*; a test seldom
finishes without it.
In addition, the number of "timed out" errors differs from test to test:
out of 2000 launches, between 0 and 200 fail with this error.
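
For reference, a minimal sketch of how such a count can be taken from the output of one test. It assumes the output was appended to "file" as in the script above; count_timeouts is a hypothetical helper name used only for illustration, and the match ignores case because the exact capitalization of the ssh message may vary.

```shell
#!/usr/bin/env bash
# count_timeouts: hypothetical helper, not part of Open MPI or ssh.
# Prints the number of lines in the given log that contain the ssh
# "connection timed out" message for port 22.
count_timeouts() {
    local log="$1"
    # -c prints the number of matching lines, -i ignores case.
    # grep exits non-zero when there are no matches (it still prints 0),
    # so "|| true" keeps the function usable under "set -e".
    grep -ci 'port 22: connection timed out' -- "$log" || true
}
```

Calling `count_timeouts file` after each test gives one number per test, which makes it easy to compare how the error count changes between tests.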
I tried to google it,
but I can't find anyone who has had this ssh problem when opening a lot of ssh
connections...
So I thought maybe someone here has had the same problem.
----------------------------------------------------------------------------------
The following are the settings I have tried changing so far:
1. net.ipv4.tcp_fin_timeout=180
http://askubuntu.com/questions/21182/how-to-change-the-default-timeout-of-internet-connection
2. sudo iptables -A INPUT -p tcp --dport ssh -j ACCEPT
http://www.serkey.com/ubuntu-ssh-connection-timed-out-due-to-firewall-behgct.html
----------------------------------------------------------------------------------
But these changes still didn't solve my problem...
I still can't figure out where the problem is, or whether there are other
potential problems :(((
Does anyone here have any idea about this situation, or has anyone had the same
problem?
Is it a *machine problem* or a *system problem*? Or does *OpenMPI* not allow me
to do something like this?
I really hope someone can give me a hand...
Thank you all very, very much!!
Best Wishes,
Jen