Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with running openMPI program
From: Ankush Kaul (ankush.rkaul_at_[hidden])
Date: 2009-04-06 10:33:57


Thank you, sir. The problem was with the paths of the 'bin' and 'lib' folders,
so I used the *mpirun --prefix* option. I now want to run a program, 'pi', on
the cluster, so where do I place the file on the master and the compute nodes?
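
A minimal sketch of why the earlier run failed, for readers hitting the same error. The quoted log below ends with "bash: orted: command not found" and exit status 127, which is the shell's status for a command it cannot find on PATH. The snippet reproduces that status locally by running orted in a shell whose PATH points at a directory that does not exist. The helper orted_under_prefix is hypothetical (it is not part of Open MPI); it only checks that a candidate install prefix really contains bin/orted before that prefix is passed to mpirun --prefix.

```shell
# Reproduce the "command not found" status seen in the log.
# PATH is deliberately set to a missing directory inside the child shell,
# so orted cannot be found even if Open MPI is installed locally.
status=0
sh -c 'PATH=/nonexistent; orted' 2>/dev/null || status=$?
echo "orted lookup exit status: $status"

# Hypothetical helper (an assumption, not an Open MPI tool):
# succeeds only if the given prefix contains an executable bin/orted.
orted_under_prefix() {
    [ -x "$1/bin/orted" ]
}
```

Passing the install prefix with mpirun --prefix makes mpirun set PATH and LD_LIBRARY_PATH on the remote node before it starts orted there, which is why that option resolved the error.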

Also, how can I tell that the program is using the resources of both
nodes?
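
One possible way to check the second question, sketched under assumptions: mpirun can launch ordinary programs, so running hostname through it shows where each process was started. The host addresses are the two from the quoted message below; the install prefix /opt/openmpi is a placeholder, not a value from this thread, and count_nodes and run_hostname_check are hypothetical helper names. Because /tmp on the master is NFS-mounted on the slave in the setup described below, a binary in /tmp should be visible at the same path on both nodes.

```shell
# Placeholder install prefix (assumption); replace with the real one.
OMPI_PREFIX="${OMPI_PREFIX:-/opt/openmpi}"
# The two node addresses mentioned in the quoted message.
HOSTS="${HOSTS:-192.168.67.18,192.168.45.65}"

# Read one hostname per line on stdin and print how many distinct ones appear.
count_nodes() {
    sort -u | wc -l | tr -d ' '
}

# Launch one 'hostname' per process and count the distinct nodes that answered.
# Defined here but not called, since it needs a working cluster.
run_hostname_check() {
    mpirun --prefix "$OMPI_PREFIX" -np 2 --host "$HOSTS" hostname | count_nodes
}
```

Calling run_hostname_check on the master should print 2 if a process was started on each node; a result of 1 would suggest both processes ran on the same machine.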

On Sat, Apr 4, 2009 at 7:05 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:

> It might be best to:
>
> 1. Setup a non-root user to run MPI applications
> 2. Setup SSH keys between the hosts for this non-root user so that you can
> "ssh <otherhost> uptime" and not be prompted for a password/passphrase
>
> This should help.
>
>
>
> On Apr 4, 2009, at 5:51 AM, Ankush Kaul wrote:
>
>> I followed the steps given here to set up an Open MPI cluster:
>> http://www.ps3cluster.umassd.edu/step3mpi.html
>>
>> My cluster consists of two nodes, master (192.168.67.18) and
>> slave (192.168.45.65), connected directly through a crossover cable.
>>
>> After setting up the cluster and configuring the master node, I mounted
>> the /tmp folder of the master node on the slave node (I had some problems
>> with NFS at first, but I worked my way out of it).
>>
>> Then I copied the 'pi.c' program into the /tmp folder and successfully
>> compiled it, giving me a binary file 'pi'.
>>
>> Now when I try to run the binary file using the following command:
>>
>> #mpirun -np 2 ./pi
>>
>> root_at_192.168.45.65's password:
>> <it asks for the password>
>>
>> After entering the password, it gives the following error:
>>
>> bash: orted: command not found
>> [ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> base/pls_base_orted_cmds.c at line 275
>> [ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> pls_rsh_module.c at line 1166
>> [ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c
>> at line 90
>> [ccomp.cluster:18963] ERROR: A daemon on node 192.168.45.65 failed to
>> start as expected.
>> [ccomp.cluster:18963] ERROR: There may be more information available from
>> [ccomp.cluster:18963] ERROR: the remote shell (see above).
>> [ccomp.cluster:18963] ERROR: The daemon exited unexpectedly with status
>> 127.
>> [ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> base/pls_base_orted_cmds.c at line 188
>> [ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file
>> pls_rsh_module.c at line 1198
>> --------------------------------------------------------------------------
>> mpirun was unable to cleanly terminate the daemons for this job. Returned
>> value Timeout instead of ORTE_SUCCESS.
>> --------------------------------------------------------------------------
>>
>> I am totally lost now, as this is the first time I am working on a cluster
>> project, and I need some help.
>>
>> Thank you
>> Ankush
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Jeff Squyres
> Cisco Systems