On 11/22/13 11:15 AM, Ralph Castain wrote:
On 22.11.2013, at 18:56, Jason Gans wrote:
On 11/22/13 10:47 AM, Reuti wrote:
Since there are only 8 Torque nodes in the cluster, I
submitted the job by requesting 8 nodes, i.e. "qsub -I -l
On 22.11.2013, at 17:32, Gans, Jason D wrote:
I would like to run an instance
of my application on every *core* of a small cluster.
I am using Torque 2.5.12 to run jobs on the cluster.
The cluster in question is a heterogeneous collection
of machines that are all past their prime.
Specifically, the number of cores ranges from 2 to 8.
Here is the Torque "nodes" file:
You submitted the job itself by requesting 24 cores for
When I use openmpi-1.6.3, I can oversubscribe nodes
but the tasks are allocated to nodes without regard to
the number of cores on each node (specified by the
"np=xx" in the nodes file). For example, when I run
"mpirun -np 24 hostname", mpirun places three
instances of "hostname" on each node, despite the fact
that some nodes only have two processors and some have
eight.
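One quick way to tally the placement described above (the mpirun command is from the thread; the sort/uniq pipeline is just an illustrative way to summarize its output per node):

```shell
# Each rank prints its host's name; counting duplicates shows how
# many instances landed on each node.
mpirun -np 24 hostname | sort | uniq -c
```

With the behavior Jason describes, every node would show a count of three regardless of its core count.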
No, AFAICT it's necessary to request 24 there too. To
investigate it further, it would also be good to have your job
script copy the $PBS_NODEFILE to your home directory for later
inspection, i.e. to check whether you are getting the correct
values there already.
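A minimal job-script fragment along the lines Reuti suggests (the destination filename is just an example):

```shell
#!/bin/sh
# Save the node list Torque hands the job, for later inspection.
# $PBS_NODEFILE and $PBS_JOBID are set by Torque inside the job.
cp "$PBS_NODEFILE" "$HOME/pbs_nodefile.$PBS_JOBID"

# Quick sanity check: with one line per slot, a node with np=8
# should appear 8 times in the file.
sort "$PBS_NODEFILE" | uniq -c
```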
Not really - we take the number of slots on each node and add
Question: is that a copy/paste of the actual PBS_NODEFILE? It
doesn't look right to me - there is supposed to be one node
entry for each slot. In other words, it should have looked like
That is what I expected -- however, the $PBS_NODEFILE lists each
node just once.
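For illustration only (hypothetical hostnames): with a two-core node n01 and a four-core node n02 allocated in full, a per-slot $PBS_NODEFILE, as Ralph expects, would read

```
n01
n01
n02
n02
n02
n02
```

whereas a per-node listing, which is what Jason reports, would show n01 and n02 once each.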
Been a while since I used Torque, but I suspect Reuti is right - you have to ask for 24 slots. Sounds like Torque is only assigning you one slot/node.