Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Running openMPI job with torque
From: Gus Correa (gus_at_[hidden])
Date: 2010-06-09 13:45:31


Hi Govind

Besides what Ralph said, make sure your OpenMPI was
built with Torque ("tm") support.

Suggestion:
Do:

ompi_info --all | grep tm

It should show lines like these:

MCA ras: tm (MCA v2.0, API v2.0, Component v1.4.2)
MCA plm: tm (MCA v2.0, API v2.0, Component v1.4.2)
...
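
If those "tm" lines are missing, the usual fix is to rebuild OpenMPI
with Torque support enabled at configure time. A rough sketch only,
assuming Torque is installed under /usr/local and using a hypothetical
install prefix (adjust both to your site):

./configure --with-tm=/usr/local --prefix=/opt/openmpi-1.4.2
make all install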

***

If your OpenMPI doesn't have torque support,
you may need to pass the node list to your mpirun command.

Suggestion:

/usr/lib64/openmpi/1.4-gcc/bin/mpirun -hostfile $PBS_NODEFILE -np 4 ./hello
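
Note that $PBS_NODEFILE is only defined inside a running Torque job,
so that mpirun line belongs in your batch script, not in an interactive
shell. A sketch (tcsh style, same hypothetical paths as the script
further down):

#! /bin/tcsh
#PBS -l nodes=4:ppn=1
#PBS -N myjob
cd $PBS_O_WORKDIR
/usr/lib64/openmpi/1.4-gcc/bin/mpirun -hostfile $PBS_NODEFILE -np 4 ./hello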

***

Also, assuming your OpenMPI has torque support:

Did you request 4 nodes from torque?

If you don't request nodes and processors explicitly,
torque will give you its defaults
(which may be just one processor on one node).

Suggestion:

A script like this (tcsh style here, adjusted to your site),
say called run_my_pbs_job.tcsh:

*********

#! /bin/tcsh
#PBS -l nodes=4:ppn=1
#PBS -q default@your.torque.server
#PBS -N myjob
cd $PBS_O_WORKDIR
/usr/lib64/openmpi/1.4-gcc/bin/mpirun -np 4 ./hello

*********

Then do:
qsub run_my_pbs_job.tcsh
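
You can also put the resource request on the qsub command line instead
of (or as well as) the #PBS lines in the script; command-line options
take precedence over the directives in the script. For example, reusing
the queue name from your original submission (adjust to your site):

qsub -q long -l nodes=4:ppn=1 run_my_pbs_job.tcsh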

**

You can get more information about the PBS syntax using "man qsub".

**

I hope this helps,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

Ralph Castain wrote:
>
> On Jun 9, 2010, at 10:00 AM, Govind Songara wrote:
>
>> Thanks Ralph, after giving the full path of hello it runs.
>> But it runs on only one rank:
>> Hello World! from process 0 out of 1 on node56.beowulf.cluster
>
> Just to check things out, I would do:
>
> mpirun --display-allocation --display-map -np 4 ....
>
> That should show you the allocation and where OMPI is putting the procs.
>
>> There is also an error:
>> >cat my-script.sh.e43
>> stty: standard input: Invalid argument
>
> Not really sure here - must be an error in the script itself.
>
>>
>>
>>
>> On 9 June 2010 16:46, Ralph Castain <rhc_at_[hidden]> wrote:
>>
>> You need to include the path to "hello" unless it sits in your
>> PATH environment!
>>
>> On Jun 9, 2010, at 9:37 AM, Govind wrote:
>>
>>>
>>> #!/bin/sh
>>> /usr/lib64/openmpi/1.4-gcc/bin/mpirun hello
>>>
>>>
>>> On 9 June 2010 16:21, David Zhang <solarbikedz_at_[hidden]> wrote:
>>>
>>> What does your my-script.sh look like?
>>>
>>> On Wed, Jun 9, 2010 at 8:17 AM, Govind <govind.rhul_at_[hidden]> wrote:
>>>
>>> Hi,
>>>
>>> I have installed the following openMPI packages on the worker node
>>> from the repo:
>>> openmpi-libs-1.4-4.el5.x86_64
>>> openmpi-1.4-4.el5.x86_64
>>> mpitests-openmpi-3.0-2.el5.x86_64
>>> mpi-selector-1.0.2-1.el5.noarch
>>>
>>> torque-client-2.3.6-2cri.el5.x86_64
>>> torque-2.3.6-2cri.el5.x86_64
>>> torque-mom-2.3.6-2cri.el5.x86_64
>>>
>>>
>>> I am having some problems running MPI jobs with torque:
>>> qsub -q long -l nodes=4 my-script.sh
>>> 42.pbs1.pp.rhul.ac.uk
>>>
>>> cat my-script.sh.e41
>>> stty: standard input: Invalid argument
>>> --------------------------------------------------------------------------
>>> mpirun was unable to launch the specified application as
>>> it could not find an executable:
>>>
>>> Executable: hello
>>> Node: node56.beowulf.cluster
>>>
>>> while attempting to start process rank 0.
>>> ==================================
>>>
>>> I could run the binary directly on the node without any
>>> problem.
>>> mpiexec -n 4 hello
>>> Hello World! from process 2 out of 4 on node56.beowulf.cluster
>>> Hello World! from process 0 out of 4 on node56.beowulf.cluster
>>> Hello World! from process 3 out of 4 on node56.beowulf.cluster
>>> Hello World! from process 1 out of 4 on node56.beowulf.cluster
>>>
>>> Could you please advise if I am missing anything here.
>>>
>>>
>>> Regards
>>> Govind
>>>
>>>
>>> --
>>> David Zhang
>>> University of California, San Diego
>>>
>>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users