Open MPI User's Mailing List Archives

Subject: [OMPI users] having trouble expanding our cluster
From: yewyong (uyong81_at_[hidden])
Date: 2009-10-20 09:33:32


Hi guys,

I am new to cluster computing, so please bear with me.
I am currently trying to make use of the desktops available in the labs.

I started by trying out 2 labs: Lab A, which has 192.168.0.xx IPs,
and Lab B, which has public IPs in the 202.185.77.xx series.

I have been running ParaView in Lab A without any problem so far.
(mpirun -np 18 -machinefile quad_lan pvserver)

When I try to extend the cluster to Lab B's desktops by specifying their
IPs in my machinefile like this:

202.185.77.110 slots=4 max-slots=4   # eth1 = 202.185.77.110; eth0 = 192.168.0.10
202.185.77.219 slots=2 max-slots=2   <-- this is "pc226"
192.168.0.227 slots=2 max-slots=2

my program hangs while my client tries to connect to the pvserver.
If I wait long enough, it shows this error:

yewyong@vrc1:~/installer/mpi_test> mpirun -np 8 -machinefile quad_hama pvserver --use-offscreen-rendering
Listen on port: 11111
Waiting for client...
Client connected.
[pc226][[8636,1],6][btl_tcp_endpoint.c:631:mca_btl_tcp_endpoint_complete_connect] connect() failed: Connection timed out (110)
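
Since the failure is a TCP connect() timing out, one rough way to narrow it down is to test plain TCP reachability from one node to another, outside of MPI. The following is only a minimal sketch: the probe() helper is hypothetical, and the port to test is an assumption (for example 22, if sshd happens to be running on the target), because the ports Open MPI itself picks are not shown in the output above.

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Try one TCP connect and return 'ok' or a short failure reason."""
    try:
        # create_connection resolves the host and applies the timeout
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        # no answer within `timeout` seconds (packets dropped or no route)
        return "timed out"
    except OSError as exc:
        # e.g. ECONNREFUSED (host reachable, nothing listening), EHOSTUNREACH
        return errno.errorcode.get(exc.errno, str(exc))


# Example call, to be run on one node against another (port 22 is an assumption):
# print(probe("192.168.0.227", 22))
```

Running it from pc226 against the 192.168.0.xx address, and from a Lab A machine against the 202.185.77.xx addresses, would show in which direction the connection attempts fail.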

Note that running "hello_world" shows that the machines in this makeshift cluster
are able to communicate with each other.
After trying different orderings in the machinefile, I found that
whenever an mpirun server is started across both IP ranges, the system hangs
when the client tries to connect.
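
To make the "cross IPs" case easier to spot, here is a small sketch that reads machinefile-style text and separates private from public addresses. The function names are made up for illustration, the parsing only covers the simple "address key=value" lines used above, and it assumes every entry is a literal IP address rather than a hostname.

```python
import ipaddress


def parse_machinefile(text):
    """Return a list of (host, options) pairs from hostfile-style text."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        opts = dict(f.split("=", 1) for f in fields[1:] if "=" in f)
        entries.append((fields[0], opts))
    return entries


def split_by_scope(entries):
    """Separate hosts into private-range and public addresses."""
    private, public = [], []
    for host, _opts in entries:
        addr = ipaddress.ip_address(host)  # raises ValueError for hostnames
        (private if addr.is_private else public).append(host)
    return private, public


MACHINEFILE = """\
202.185.77.110 slots=4 max-slots=4
202.185.77.219 slots=2 max-slots=2
192.168.0.227 slots=2 max-slots=2
"""

entries = parse_machinefile(MACHINEFILE)
private, public = split_by_scope(entries)
print("private:", private)
print("public:", public)
if private and public:
    print("machinefile mixes private and public addresses")
```

With the machinefile above, the two 202.185.77.xx hosts are reported as public and 192.168.0.227 as private, which is the combination that hangs.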

Could this be related to the problem raised in
http://www.open-mpi.org/community/lists/devel/2009/07/6385.php ?

I'm not sure whether I've fully described the situation, so please let me know
if you need any extra information.

Thank you in advance.

yewyong.