Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Open MPI program cannot complete
From: Jack Bryan (dtustudy68_at_[hidden])
Date: 2010-10-25 13:35:25


Thanks.
I have to use #PBS to submit any job on my cluster. I cannot use the command line to hang a job on my cluster.
This is my script:
--------------------------------------
#!/bin/bash
#PBS -N jobname
#PBS -l walltime=00:08:00,nodes=1
#PBS -q queuename
COMMAND=/mypath/myprog
NCORES=5

cd $PBS_O_WORKDIR
NODES=`cat $PBS_NODEFILE | wc -l`
NPROC=$(( $NCORES * $NODES ))
mpirun -np $NPROC --mca btl self,sm,openib $COMMAND
-------------------------------------------

Where should I put the gdb command (gdb --batch -ex 'bt full' -ex 'info reg' -pid ZOMBIE_PID) in the script? And how do I get ZOMBIE_PID from within the script?
Any help is appreciated.
thanks
Oct. 25 2010
Date: Mon, 25 Oct 2010 19:24:35 +0200
From: jed_at_[hidden]
To: users_at_[hidden]
Subject: Re: [OMPI users] Open MPI program cannot complete

On Mon, Oct 25, 2010 at 19:07, Jack Bryan <dtustudy68_at_[hidden]> wrote:

> I need to use #PBS parallel job script to submit a job on MPI cluster.

Is it not possible to reproduce locally? Most clusters have a way to submit an interactive job (which would let you start this thing and then inspect individual processes). Ashley's Padb suggestion will certainly be better in a non-interactive environment.

> Where should I put the (gdb --batch -ex 'bt full' -ex 'info reg' -pid ZOMBIE_PID) in the script ?

Is control returning to your script after rank 0 has exited? In that case, you can just put this on the next line.

> How to get the ZOMBIE_PID ?

"ps" from the command line, or getpid() from C code.
Jed
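
A minimal sketch of the suggestion above, under several assumptions: control really does return to the job script while the stuck process is still alive, gdb is installed on the compute node, and the job runs on a single node (nodes=1, as in the script), since only processes on the node running the script are visible. The helper names leftover_pids and dump_backtraces are hypothetical. To stay self-contained the sketch reads the Linux /proc filesystem instead of parsing "ps" output; the kernel truncates the name in /proc/PID/comm to 15 characters, which is enough for "myprog".

```shell
#!/bin/bash
# Hypothetical helper: print the PIDs of processes owned by the current
# user whose executable name (as recorded in /proc/PID/comm) equals $1.
leftover_pids() {
    local name=$1 dir
    for dir in /proc/[0-9]*; do
        # -O: the /proc entry is owned by the user running this script.
        if [ -O "$dir" ] && [ "$(cat "$dir/comm" 2>/dev/null)" = "$name" ]; then
            echo "${dir#/proc/}"
        fi
    done
}

# Hypothetical helper: attach gdb in batch mode to every leftover process
# and print a full backtrace plus registers (the gdb options from the thread).
dump_backtraces() {
    local pid
    for pid in $(leftover_pids "$1"); do
        echo "=== backtrace for PID $pid ==="
        gdb --batch -ex 'bt full' -ex 'info reg' -pid "$pid"
    done
}

# In the job script this would go on the line after mpirun, for example:
#   mpirun -np $NPROC --mca btl self,sm,openib $COMMAND
#   dump_backtraces "$(basename "$COMMAND")"
```

The gdb output goes to the job's standard output file, so it can be read after the job ends. If mpirun itself never returns, this approach does not help, and an interactive job or Padb is the better route.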