Also, how can I find out where my MPI libraries and include directories are?
Let me explain in detail.
When we had only 2 nodes, 1 master (192.168.67.18) + 1 compute node (192.168.45.65),
my openmpi-default-hostfile looked like:
192.168.67.18 slots=2
192.168.45.65 slots=2
After this, running the command mpirun /work/Pi on the master node gave:
# root@192.168.45.65 password :
After entering the password, the program ran on both nodes.
Now, after connecting a second compute node and editing the hostfile to:
192.168.67.18 slots=2
192.168.45.65 slots=2
192.168.67.241 slots=2
and then running the command mpirun /work/Pi on the master node, we got:
# root@192.168.45.65's password: root@192.168.67.241's password:
which does not accept the password.
We are trying to implement a passwordless cluster anyway, but I would like to know why this problem is occurring.
On Sat, Apr 18, 2009 at 3:40 AM, Gus Correa <gus@ldeo.columbia.edu> wrote:
Ankush
You need to set up passwordless connections with ssh to the node you just
added. You (or somebody else) probably did this already on the first compute node, otherwise the MPI programs wouldn't run
across the network.
See the very last sentence on this FAQ:
http://www.open-mpi.org/faq/?category=running#run-prereqs
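As a rough illustration of the passwordless setup described above, the steps might be sketched as follows when run on the master node. This is a sketch under assumptions, not a tested procedure for this cluster: the HOSTFILE path, the KEY location, and the function names are hypothetical, the root login is taken from the prompts shown in this thread, and it assumes the ssh-copy-id helper that ships with OpenSSH is installed.

```shell
#!/bin/sh
# Sketch only: push the master's public key to every host in an Open MPI hostfile.
# HOSTFILE, REMOTE_USER and KEY are illustrative defaults (assumptions), apart from
# the root login, which is what the prompts in this thread show.
HOSTFILE="${HOSTFILE:-./openmpi-default-hostfile}"
REMOTE_USER="${REMOTE_USER:-root}"
KEY="${KEY:-$HOME/.ssh/id_rsa}"

# Print the host (first field) of each non-blank, non-comment hostfile line.
list_hosts() {
    awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}

# Create an RSA key pair with an empty passphrase if none exists yet.
ensure_key() {
    mkdir -p "$(dirname "$KEY")" && chmod 700 "$(dirname "$KEY")"
    [ -f "$KEY" ] || ssh-keygen -t rsa -N "" -f "$KEY"
}

# Copy the public key to each host, then check that a non-interactive login works.
setup_passwordless() {
    ensure_key
    for host in $(list_hosts "$HOSTFILE"); do
        # ssh-copy-id asks for the account password one last time.
        ssh-copy-id -i "$KEY.pub" "$REMOTE_USER@$host"
        # BatchMode=yes makes ssh fail instead of prompting.
        if ssh -o BatchMode=yes "$REMOTE_USER@$host" true; then
            echo "$host: passwordless login OK"
        else
            echo "$host: key-based login failed"
        fi
    done
}
```

After sourcing the file, calling setup_passwordless walks the hostfile once. An empty passphrase is the simplest option for a small test cluster; a passphrase-protected key with ssh-agent is the more careful alternative.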
And try this recipe (if you use RSA keys instead of DSA, replace all "dsa" by "rsa"):
http://www.sshkeychain.org/mirrors/SSH-with-Keys-HOWTO/SSH-with-Keys-HOWTO-4.html#ss4.3
I hope this helps.
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------
Ankush Kaul wrote:
Thank you, I am reading up on the tools you suggested.
I am facing another problem: my cluster works fine with 2 hosts (1 master + 1 compute node), but when I tried to add another node (1 master + 2 compute nodes) it stopped working. It works fine when I give the command mpirun -host <hostname> /work/Pi
but when I try to run
mpirun /work/Pi it gives the following error:
root@192.168.45.65's password:
Permission denied, please try again. <The password I provide is correct>
root@192.168.45.65's password:
Permission denied, please try again.
root@192.168.67.241's password: [ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
Permission denied (publickey,gssapi-with-mic,password).
Permission denied, please try again.
[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1166
[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
[ccomp1.cluster:03503] ERROR: A daemon on node 192.168.45.65 failed to start as expected.
[ccomp1.cluster:03503] ERROR: There may be more information available from
[ccomp1.cluster:03503] ERROR: the remote shell (see above).
[ccomp1.cluster:03503] ERROR: The daemon exited unexpectedly with status 255.
[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1198
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons for this job. Returned value Timeout instead of ORTE_SUCCESS
What is the problem here?
On Tue, Apr 14, 2009 at 7:15 PM, Eugene Loh <Eugene.Loh@sun.com> wrote:
Ankush Kaul wrote:
Finally, after specifying the hostfiles, the cluster is working
fine. We downloaded a few benchmarking programs, but I would like
to know if there is any GUI-based benchmarking software, so that
it is easier to demonstrate the working of our cluster while
displaying it.
I'm confused what you're looking for here, but thought I'd venture a
suggestion.
There are GUI-based performance analysis and tracing tools. E.g.,
run a program, [[semi-]automatically] collect performance data, run
a GUI-based analysis tool on the data, visualize what happened on
your cluster. Would this suit your purposes?
If so, there are a variety of tools out there you could try. Some
are platform-specific or cost money. Some are widely/freely
available. Examples of these tools include Intel Trace Analyzer,
Jumpshot, Vampir, TAU, etc. I do know that Sun Studio (Performance
Analyzer) is available via free download on x86 and SPARC and Linux
and Solaris and works with OMPI. Possibly the same with Jumpshot.
VampirTrace instrumentation is already in OMPI, but then you need
to figure out the analysis-tool part. (I think the Vampir GUI tool
requires a license, but I'm not sure. Maybe you can convert to TAU,
which is probably available for free download.)
Anyhow, I don't even know if that sort of thing fits your
requirements. Just an idea.
_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users