I need to use a #PBS parallel job script to submit a job on an MPI cluster.
Is it not possible to reproduce locally? Most clusters have a way to submit an interactive job (which would let you start this thing and then inspect individual processes). Ashley's Padb suggestion will certainly be better in a non-interactive environment.
Where should I put the gdb command (gdb --batch -ex 'bt full' -ex 'info reg' -pid ZOMBIE_PID) in the script?
Is control returning to your script after rank 0 has exited? In that case, you can just put this on the next line.
How do I get the ZOMBIE_PID?
Use "ps" from the command line, or call getpid() from the C code.
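As a rough sketch of how the two answers above could fit together in the job script: the helper names pids_by_name and dump_stacks, and the program name my_mpi_app, are made up for illustration and are not from your setup. It assumes a POSIX-style "ps", that gdb is installed on the node where the stuck process lives, and that the line after mpirun is actually reached. Note that on Linux the "comm" column is truncated to 15 characters, so a long executable name would need to be shortened to match.

```shell
#!/bin/sh
# Hypothetical helper: print the PIDs of this user's processes whose
# command name (the "comm" column of ps) is exactly $1.
pids_by_name() {
    ps -u "$(id -u)" -o pid= -o comm= | awk -v name="$1" '$2 == name { print $1 }'
}

# Hypothetical helper: attach gdb in batch mode to every matching process
# and dump a full backtrace plus the registers, as suggested earlier.
dump_stacks() {
    for pid in $(pids_by_name "$1"); do
        echo "=== PID $pid ==="
        gdb --batch -ex 'bt full' -ex 'info reg' -pid "$pid"
    done
}

# In the PBS script this would go on the line after mpirun, e.g.:
#   mpirun ./my_mpi_app      # my_mpi_app is a placeholder name
#   dump_stacks my_mpi_app
```

This only sees processes on the node where the script itself runs; ranks left behind on other nodes would need the same commands run there (or a tool like Padb).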