
Subject: [OMPI users] problem w sge 6.2 & openmpi
From: Eli Morris (emorris_at_[hidden])
Date: 2009-08-05 18:11:53


Hi Rolf,

Thanks for answering!

Eli

Here is the qstat -t output when I launch the job with 8 processors. It
looks to me like it is actually using compute node compute-0-8. The
mpirun job was submitted on the head node 'nimbus'.

I also tried swapping hostname in for mpirun in the job script. For both
8 and 16 processors I got about the same output; the only difference was
which node it ran on:

[emorris_at_nimbus ~/test]$ more mpi-ring.qsub.o254
compute-0-14.local
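
(For reference, that test was just the job script shown further below
with the launch line swapped out; roughly this, as a hypothetical sketch:)

#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash
# plain hostname in place of the mpirun line, to see where SGE puts the job
hostname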

[emorris_at_nimbus ~/test]$ qsub -pe orte 8 mpi-ring.qsub
Your job 255 ("mpi-ring.qsub") has been submitted
[emorris_at_nimbus ~/test]$ qstat -t
job-ID  prior   name       user    state submit/start at     queue                      master ja-task-ID task-ID state cpu mem io stat failed
-----------------------------------------------------------------------------------------------------------------------------------------------
   255  0.55500 mpi-ring.q emorris r     08/05/2009 15:03:12 all.q_at_compute-0-8.local MASTER
                                                             all.q_at_compute-0-8.local SLAVE
                                                             all.q_at_compute-0-8.local SLAVE
                                                             all.q_at_compute-0-8.local SLAVE
                                                             all.q_at_compute-0-8.local SLAVE
                                                             all.q_at_compute-0-8.local SLAVE
                                                             all.q_at_compute-0-8.local SLAVE
                                                             all.q_at_compute-0-8.local SLAVE
                                                             all.q_at_compute-0-8.local SLAVE

Here is the entire job script:

[emorris_at_nimbus ~/test]$ more mpi-ring.qsub
#!/bin/bash
#
#$ -cwd
#$ -j y
#$ -S /bin/bash
#
#hostname
/opt/openmpi/bin/mpirun --debug-daemons --mca plm_base_verbose 40 --mca plm_rsh_agent ssh -np $NSLOTS $HOME/test/mpi-ring
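
(For what it's worth, running hostname under mpirun itself, rather than
on its own, would make every launched process print the host it landed
on; a rough sketch using the same settings as the script above:)

/opt/openmpi/bin/mpirun --mca plm_rsh_agent ssh -np $NSLOTS hostname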

[root_at_nimbus gridengine]# qconf -sp orte
pe_name orte
slots 9999
user_lists NONE
xuser_lists NONE
start_proc_args /bin/true
stop_proc_args /bin/true
allocation_rule $fill_up
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
accounting_summary TRUE
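
(For reference, allocation_rule $fill_up tells SGE to fill all of a
host's slots before moving on to the next host, which matches the qstat
output above where every slot landed on compute-0-8. A hypothetical PE
that spreads slots across hosts instead would differ only in that one
line:)

allocation_rule $round_robin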