Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with running openMPI program
From: Ankush Kaul (ankush.rkaul_at_[hidden])
Date: 2009-04-17 17:31:16


Thank you, I am reading up on the tools you suggested.
I am facing another problem: my cluster works fine with 2 hosts (1
master + 1 compute node), but when I tried to add another node (1 master + 2
compute nodes) it stopped working. It works fine when I give the command
mpirun -host <hostname> /work/Pi

but when I try to run
mpirun /work/Pi
it gives the following error:

root_at_192.168.45.65's password: root_at_192.168.67.241's password:

Permission denied, please try again. <The password I provide is correct>

root_at_192.168.45.65's password:

Permission denied, please try again.

root_at_192.168.45.65's password:

Permission denied (publickey,gssapi-with-mic,password).

Permission denied, please try again.

root_at_192.168.67.241's password: [ccomp1.cluster:03503] [0,0,0]
ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
pls_rsh_module.c at line 1166

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c
at line 90

[ccomp1.cluster:03503] ERROR: A daemon on node 192.168.45.65 failed to start
as expected.

[ccomp1.cluster:03503] ERROR: There may be more information available from

[ccomp1.cluster:03503] ERROR: the remote shell (see above).

[ccomp1.cluster:03503] ERROR: The daemon exited unexpectedly with status
255.

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
base/pls_base_orted_cmds.c at line 188

[ccomp1.cluster:03503] [0,0,0] ORTE_ERROR_LOG: Timeout in file
pls_rsh_module.c at line 1198

What is the problem here?

--------------------------------------------------------------------------

mpirun was unable to cleanly terminate the daemons for this job. Returned
value Timeout instead of ORTE_SUCCESS
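
The password prompts in the output above suggest that mpirun's remote-shell launcher could not log in to the compute nodes without interaction. Below is a minimal sketch for checking that from the master node, assuming the launcher uses ssh and logs in as root, as the prompts indicate. The function names are hypothetical and the addresses are the ones from the error output; `BatchMode` and `ConnectTimeout` are standard OpenSSH client options.

```python
import subprocess


def ssh_check_command(host, user="root", timeout=5):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",               # never ask for a password
        "-o", f"ConnectTimeout={timeout}",   # give up quickly on dead hosts
        f"{user}@{host}",
        "true",                              # trivial remote command
    ]


def find_unreachable(hosts, run=subprocess.run):
    """Return the hosts where a non-interactive ssh login did not succeed."""
    failed = []
    for host in hosts:
        result = run(ssh_check_command(host), capture_output=True)
        if result.returncode != 0:
            failed.append(host)
    return failed


if __name__ == "__main__":
    # Addresses taken from the error output; adjust for your own cluster.
    nodes = ["192.168.45.65", "192.168.67.241"]
    for host in find_unreachable(nodes):
        print(f"{host}: non-interactive ssh login failed")
```

Any host this reports would also make mpirun stop at a password prompt, so it is a quick way to confirm whether key-based login is set up on every node before launching a job.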

On Tue, Apr 14, 2009 at 7:15 PM, Eugene Loh <Eugene.Loh_at_[hidden]> wrote:

> Ankush Kaul wrote:
>
>> Finally, after specifying the hostfiles, the cluster is working fine. We
>> downloaded a few benchmarking programs, but I would like to know if there is
>> any GUI-based benchmarking software, so that it is easier to demonstrate the
>> working of our cluster while displaying it.
>>
>
> I'm confused what you're looking for here, but thought I'd venture a
> suggestion.
>
> There are GUI-based performance analysis and tracing tools. E.g., run a
> program, [[semi-]automatically] collect performance data, run a GUI-based
> analysis tool on the data, visualize what happened on your cluster. Would
> this suit your purposes?
>
> If so, there are a variety of tools out there you could try. Some are
> platform-specific or cost money. Some are widely/freely available.
> Examples of these tools include Intel Trace Analyzer, Jumpshot, Vampir,
> TAU, etc. I do know that Sun Studio (Performance Analyzer) is available via
> free download on x86 and SPARC and Linux and Solaris and works with OMPI.
> Possibly the same with Jumpshot. VampirTrace instrumentation is already in
> OMPI, but then you need to figure out the analysis-tool part. (I think the
> Vampir GUI tool requires a license, but I'm not sure. Maybe you can convert
> to TAU, which is probably available for free download.)
>
> Anyhow, I don't even know if that sort of thing fits your requirements.
> Just an idea.