Hi Ted,
Does the "default mpirun command" match the build environment for
quest_ompi.x?
That is, which MPI implementation (MPICH, LAM/MPI, Open MPI, or other)
was quest_ompi.x compiled and linked with, and does that match the
result of "which mpirun"? You might try running a job through your PBS
system that simply executes the "which mpirun" command and see what you
get.
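A minimal sketch of that check, meant to be pasted into a PBS job script. It assumes a POSIX shell; the helper name report_launcher is made up for illustration, and the ldd step only tells you something if the executable is dynamically linked and CODE is set as in Ted's script.

```shell
#!/bin/sh
# Hypothetical helper: print where a command resolves on the current PATH,
# or "not found" if it does not resolve at all.
report_launcher() {
    path=$(command -v "$1" 2>/dev/null) || path="not found"
    echo "$1 -> $path"
}

# Inside the batch job, PATH may differ from the login shell.
report_launcher mpirun
report_launcher mpiexec

# If CODE points at the executable, list any MPI shared libraries it uses.
# (Assumption: only informative for dynamically linked binaries.)
if [ -n "${CODE:-}" ] && [ -x "$CODE" ]; then
    ldd "$CODE" | grep -i mpi || echo "no MPI shared library listed for $CODE"
fi
```

Comparing the job's output with what the same commands print in a login shell should show whether the batch environment picks up a different mpirun.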
HTH,
Mac McCalla
Houston
________________________________
From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On
Behalf Of Ted Yu
Sent: 06 February 2009 10:02
To: Open MPI Users; Ralph Castain
Subject: Re: [OMPI users] Global Communicator
Just to make sure, because I have to use open mpi for this program:
I'm using the default mpirun command.
When I type "man mpirun", these are the first few lines:
MPIRUN(1)                  OPEN MPI COMMANDS                  MPIRUN(1)

NAME
    orterun, mpirun, mpiexec - Execute serial and parallel jobs in
    Open MPI.

    Note: mpirun, mpiexec, and orterun are all exact synonyms for each
    other. Using any of the names will result in exactly identical
    behavior.
Ted
--- On Fri, 2/6/09, Ralph Castain <rhc_at_[hidden]> wrote:
From: Ralph Castain <rhc_at_[hidden]>
Subject: Re: [OMPI users] Global Communicator
To: tedhyu_at_[hidden], "Open MPI Users"
<users_at_[hidden]>
Date: Friday, February 6, 2009, 7:55 AM
Hi Ted
From what I can tell, you are not using Open MPI, but mpich's
mpirun. You might want to ask for help on their mailing list.
Ralph
On Feb 6, 2009, at 8:49 AM, Ted Yu wrote:
Thanx for the reply.
I guess I should go back a step: I had used the Open MPI version on my
system, which is simply:
"mpirun -machinefile $PBS_NODEFILE -np $NPROCS ${CODE} >/ul/tedhyu/fuelcell/HOH/test/HH.out"
This did not work because I was just getting a blank output.
I tried this older version because at least I was getting an output:
"/opt/mpich-1.2.5.10-ch_p4-gcc/bin/mpirun -machinefile $PBS_NODEFILE -np $NPROCS ${CODE} >/ul/tedhyu/fuelcell/HOH/test/HH.out"
I think this older version is failing me for whatever reason. Do you
have any clue? I read somewhere that newer versions of mpirun add extra
command-line arguments to the end of the line. The newer version of
mpirun may therefore not be giving any output because it sees all the
extra command-line arguments after my output file:
>/ul/tedhyu/fuelcell/HOH/test/HH.out
This is where I'm reading that there are extra commandline arguments for
a version of mpirun:
https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2008-February/029333.html
Again, I'm new at this, and I'm just guessing. Any ideas of where to
turn would be helpful!
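On the redirection question: the shell handles ">file" itself and removes it from the command line before the program starts, so arguments that a launcher appends to the program cannot end up "after" the output file. A small sketch, using a stand-in function show_args (hypothetical, not part of any MPI) in place of mpirun:

```shell
#!/bin/sh
# Stand-in for a launcher: report the arguments it actually received.
show_args() {
    echo "argc=$#"
    for a in "$@"; do echo "arg: $a"; done
}

out=$(mktemp)
# The redirection is consumed by the shell; show_args sees three arguments.
show_args -np 2 ./prog > "$out"
cat "$out"
rm -f "$out"
```

The redirect target never shows up in the argument list, so a blank output file is more likely to come from the program or the launcher than from the position of the redirect.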
Ted
--- On Thu, 2/5/09, doriankrause <doriankrause_at_[hidden]> wrote:
From: doriankrause <doriankrause_at_[hidden]>
Subject: Re: [OMPI users] Global Communicator
To: tedhyu_at_[hidden], "Open MPI Users"
<users_at_[hidden]>
Date: Thursday, February 5, 2009, 11:14 PM
Ted Yu wrote:
> I'm trying to run a job based on openmpi.
> For some reason, the program and the global communicator are not in
> sync and it reads that there is only one processor, whereas there
> should be 2 or more. Any advice on where to look? Here is my PBS
> script. Thanx!
>
> PBS SCRIPT:
> #!/bin/sh
> ### Set the job name
> #PBS -N HH
> ### Declare myprogram non-rerunnable
> #PBS -r n
> ### Combine standard error and standard out to one file.
> #PBS -j oe
> ### Have PBS mail you results
> #PBS -m ae
> #PBS -M tedhyu_at_[hidden]
> ### Set the queue name, given to you when you get a reservation.
> #PBS -q workq
> ### Specify the number of cpus for your job. This example will run on 32 cpus
> ### using 8 nodes with 4 processes per node.
> #PBS -l nodes=1:ppn=2,walltime=70:00:00
> # Switch to the working directory; by default PBS launches processes from your home directory.
> # Jobs should only be run from /home, /project, or /work; PBS returns results via NFS.
> PBS_O_WORKDIR=/temp1/tedhyu/HH
> export CODE=/project/source/seqquest/seqquest_source_v261j/hive_CentOS4.5_parallel/build_261j/quest_ompi.x
>
> echo Working directory is $PBS_O_WORKDIR
> mkdir -p $PBS_O_WORKDIR
> cd $PBS_O_WORKDIR
> rm -rf *
> cp /ul/tedhyu/fuelcell/HOH/test/HH.in ./lcao.in
> cp /ul/tedhyu/atom_pbe/* .
> echo Running on host `hostname`
> echo Time is `date`
> echo Directory is `pwd`
> echo This job runs on the following processors:
> echo `cat $PBS_NODEFILE`
> Number=`wc -l $PBS_NODEFILE | awk '{print $1}'`
>
> export Number
> echo ${Number}
> # Define number of processors
> NPROCS=`wc -l < $PBS_NODEFILE`
> # And the number of hosts
> NHOSTS=`cat $PBS_NODEFILE|uniq|wc -l`
> echo This job has allocated $NPROCS cpus
> echo NHOSTS
> #mpirun -machinefile $PBS_NODEFILE ${CODE} >/ul/tedhyu/fuelcell/HOH/test/HH.out
> #mpiexec -np 2 ${CODE} >/ul/tedhyu/fuelcell/HOH/test/HH.out
> /opt/mpich-1.2.5.10-ch_p4-gcc/bin/mpirun -machinefile $PBS_NODEFILE -np $NPROCS ${CODE} >/ul/tedhyu/fuelcell/HOH/test/HH.out
> cd ..
> rm -rf HH
>
Please note that you are mixing Open MPI (API/library) with MPICH
(mpirun). This is a mistake I like to make, too. If you use the
Open MPI mpiexec program, it will probably work.
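One way to catch this kind of mismatch before launching is to guess the MPI family from the launcher's path and version banner. The sketch below is only illustrative: the helper name classify_launcher and the strings it matches are assumptions about typical installs, and an older mpirun may not accept --version at all, in which case only the path is informative.

```shell
#!/bin/sh
# Hypothetical helper: guess the MPI family from a launcher's path ($1)
# and the first line of its version banner ($2).
classify_launcher() {
    case "$1 $2" in
        *"Open MPI"*|*[Oo]pen[Mm][Pp][Ii]*|*open-mpi*) echo openmpi ;;
        *[Mm][Pp][Ii][Cc][Hh]*)                        echo mpich ;;
        *)                                             echo unknown ;;
    esac
}

launcher=$(command -v mpirun 2>/dev/null) || launcher=""
if [ -n "$launcher" ]; then
    # Assumption: the launcher prints a banner (or a usage message) and exits.
    banner=$("$launcher" --version 2>&1 | head -n 1)
    echo "$launcher looks like: $(classify_launcher "$launcher" "$banner")"
else
    echo "mpirun is not on PATH"
fi
```

If the result does not match the library the executable was built with, each process may start on its own, which would be consistent with the program reporting only one processor.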
Dorian
>
>
>
>
------------------------------------------------------------------------
>
>
_______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users