
Subject: Re: [OMPI users] Running openMPI job with torque
From: Gus Correa (gus_at_[hidden])
Date: 2010-06-09 13:45:31


Hi Govind

Besides what Ralph said, make sure your OpenMPI was
built with Torque ("tm") support.

Suggestion:
Do:

ompi_info --all | grep tm

It should show lines like these:

MCA ras: tm (MCA v2.0, API v2.0, Component v1.4.2)
MCA plm: tm (MCA v2.0, API v2.0, Component v1.4.2)
...
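
If you don't see any "tm" lines at all, your OpenMPI build
probably has no Torque support and you would need to rebuild
it against your Torque installation. Roughly like this (a
sketch only; the prefixes are guesses, point --with-tm at
wherever Torque's headers and libraries live on your cluster):

# Sketch: rebuild OpenMPI with Torque (tm) support.
# /usr is an assumed Torque prefix; adjust to your site.
./configure --prefix=/opt/openmpi-1.4 --with-tm=/usr
make all install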

***

If your OpenMPI doesn't have Torque support,
you may need to pass the node list to your mpirun command
via a hostfile.

Suggestion:

/usr/lib64/openmpi/1.4-gcc/bin/mpirun -hostfile $PBS_NODEFILE -np 4 ./hello
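
You can also print the node file from inside the job script,
just to double-check what Torque actually handed you
(one line per processor slot):

# Quick sanity check of the Torque allocation:
echo "Torque gave me these slots:"
cat $PBS_NODEFILE
wc -l < $PBS_NODEFILE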

***

Also, assuming your OpenMPI has torque support:

Did you request 4 nodes from torque?

If you don't request the nodes and processors,
torque will give you the default values
(which may be one processor and one node).

Suggestion:

Use a script like this (adjusted to your site; tcsh style here),
say, called run_my_pbs_job.tcsh:

*********

#! /bin/tcsh
#PBS -l nodes=4:ppn=1
#PBS -q default@your.torque.server
#PBS -N myjob
cd $PBS_O_WORKDIR
/usr/lib64/openmpi/1.4-gcc/bin/mpirun -np 4 ./hello

*********

Then do:
qsub run_my_pbs_job.tcsh
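
If you'd rather test by hand first, you can ask Torque for an
interactive session and run mpirun yourself once the prompt
comes back (queue name and node counts below are just
placeholders for your site):

qsub -I -q default -l nodes=4:ppn=1
# once the interactive shell starts on the first allocated node:
cd $PBS_O_WORKDIR
/usr/lib64/openmpi/1.4-gcc/bin/mpirun -np 4 ./hello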

**

You can get more information about the PBS syntax using "man qsub".
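
You can also check which nodes a queued or running job actually
got with qstat, using the job id that qsub printed
(e.g. 42.pbs1 in your case):

qstat -n 42.pbs1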

**

I hope this helps,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

Ralph Castain wrote:
>
> On Jun 9, 2010, at 10:00 AM, Govind Songara wrote:
>
>> Thanks Ralph, after giving the full path of hello it runs.
>> But it runs on only one rank:
>> Hello World! from process 0 out of 1 on node56.beowulf.cluster
>
> Just to check things out, I would do:
>
> mpirun --display-allocation --display-map -np 4 ....
>
> That should show you the allocation and where OMPI is putting the procs.
>
>> There is also an error:
>> >cat my-script.sh.e43
>> stty: standard input: Invalid argument
>
> Not really sure here - must be an error in the script itself.
>
>>
>>
>>
>> On 9 June 2010 16:46, Ralph Castain <rhc_at_[hidden]> wrote:
>>
>> You need to include the path to "hello" unless it sits in your
>> PATH environment!
>>
>> On Jun 9, 2010, at 9:37 AM, Govind wrote:
>>
>>>
>>> #!/bin/sh
>>> /usr/lib64/openmpi/1.4-gcc/bin/mpirun hello
>>>
>>>
>>> On 9 June 2010 16:21, David Zhang <solarbikedz_at_[hidden]> wrote:
>>>
>>> What does your my-script.sh look like?
>>>
>>> On Wed, Jun 9, 2010 at 8:17 AM, Govind <govind.rhul_at_[hidden]> wrote:
>>>
>>> Hi,
>>>
>>> I have installed the following openMPI packages on the worker node
>>> from the repo:
>>> openmpi-libs-1.4-4.el5.x86_64
>>> openmpi-1.4-4.el5.x86_64
>>> mpitests-openmpi-3.0-2.el5.x86_64
>>> mpi-selector-1.0.2-1.el5.noarch
>>>
>>> torque-client-2.3.6-2cri.el5.x86_64
>>> torque-2.3.6-2cri.el5.x86_64
>>> torque-mom-2.3.6-2cri.el5.x86_64
>>>
>>>
>>> I am having some problems running MPI jobs with torque:
>>> qsub -q long -l nodes=4 my-script.sh
>>> 42.pbs1.pp.rhul.ac.uk
>>>
>>> cat my-script.sh.e41
>>> stty: standard input: Invalid argument
>>> --------------------------------------------------------------------------
>>> mpirun was unable to launch the specified application as
>>> it could not find an executable:
>>>
>>> Executable: hello
>>> Node: node56.beowulf.cluster
>>>
>>> while attempting to start process rank 0.
>>> ==================================
>>>
>>> I could run the binary directly on the node without any
>>> problem.
>>> mpiexec -n 4 hello
>>> Hello World! from process 2 out of 4 on
>>> node56.beowulf.cluster
>>> Hello World! from process 0 out of 4 on
>>> node56.beowulf.cluster
>>> Hello World! from process 3 out of 4 on
>>> node56.beowulf.cluster
>>> Hello World! from process 1 out of 4 on
>>> node56.beowulf.cluster
>>>
>>> Could you please advise if I am missing anything here.
>>>
>>>
>>> Regards
>>> Govind
>>>
>>>
>>>
>>>
>>>
>>> --
>>> David Zhang
>>> University of California, San Diego
>>>
>>>
>>>
>>
>>
>>
>>
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users