I am not familiar with Grid Engine (SGE).
However, in Torque, the batch/resource manager I use,
allowing oversubscription requires editing the batch server's nodes file
so that each node claims more cores than it physically has.
[For instance, 'node01 np=8' would change to 'node01 np=16'.]
Maybe there is something similar in SGE.
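
The Torque nodes-file change described above can be sketched as a small shell filter. This is only an illustration, not a tested site procedure: the helper name scale_np is hypothetical, and the nodes file itself usually lives under the Torque server's server_priv directory, though where exactly depends on your installation. The filter reads a nodes file on stdin and multiplies every np= value by a given factor.

```shell
#!/bin/sh
# Hypothetical helper: scale the np= value on every line of a Torque nodes file.
# Reads the nodes file on stdin and writes the modified text to stdout.
# $1 is the oversubscription factor (defaults to 2).
scale_np() {
    factor="${1:-2}"
    awk -v f="$factor" '{
        for (i = 1; i <= NF; i++) {
            # Only touch fields of the exact form np=<integer>.
            if ($i ~ /^np=[0-9]+$/) {
                split($i, kv, "=")
                $i = "np=" (kv[2] * f)
            }
        }
        print
    }'
}

# Example: two nodes with 8 physical cores each, advertised as 16.
printf 'node01 np=8\nnode02 np=8\n' | scale_np 2
```

Write the result to a copy first and compare it with the original before replacing the real nodes file; the batch server normally has to reread the file before the new counts take effect.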
We had bad results [programs hanging or aborting]
when running large programs that include PDE solvers
[climate models] with oversubscription enabled, even when a substantial
amount of RAM was idle.
That was a while ago, and I have not pursued the issue any further.
Maybe context switching among the [surplus of] processes is the problem.
Of course, for 'hello, world' type programs oversubscription works well.
As for where the threshold lies beyond which oversubscription makes a
program break down, I'd guess only trial and error can tell.
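
One way to organize that trial and error is a small scan over oversubscription factors. The sketch below is a dry run under stated assumptions: the function name scan_oversubscription is hypothetical, the executable name ./test_vel.out is taken from the quoted submission script, and 8 cores with a maximum factor of 3 are arbitrary example values. Whether mpirun actually accepts more processes than there are slots depends on the Open MPI version and on how the scheduler is configured.

```shell
#!/bin/sh
# Hypothetical scan: launch the same program with 1x, 2x, ... the core count.
# $1 = physical cores, $2 = largest oversubscription factor to try,
# remaining arguments = launcher command (e.g. "echo mpirun" for a dry run).
scan_oversubscription() {
    cores="$1"
    max_factor="$2"
    shift 2
    factor=1
    while [ "$factor" -le "$max_factor" ]; do
        np=$((cores * factor))
        echo "== factor $factor: $np processes on $cores cores =="
        # Report a failed launch instead of stopping the whole scan.
        "$@" -np "$np" ./test_vel.out || echo "run with np=$np failed"
        factor=$((factor + 1))
    done
}

# Dry run: print the commands instead of launching them.
scan_oversubscription 8 3 echo mpirun
```

Replacing "echo mpirun" with "mpirun" launches the runs for real; timing each one and noting the first factor at which the program hangs or aborts gives a rough estimate of the threshold.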
I hope this helps,
On Dec 23, 2011, at 2:42 PM, Santosh Ansumali wrote:
> Dear All,
> We are running a PDE solver which is memory bound. Due to
> cache-related issues, a smaller number of grid points per core leads to
> better performance for this code. Thus, though available memory per
> core is more than 2 GB, we are able to get good performance by using
> less than 1 GB per core.
> I want to know whether oversubscribing the cores can potentially
> improve performance of such a code. My thinking is that if I
> oversubscribe the cores, each thread will be using less than 1 GB so
> cache-related problems will be less severe. Is this logic correct, or
> will performance deteriorate further due to cache conflicts?
> In case over-subscription can help, how shall I modify the
> submission file (using Sun Grid Engine) to enable over-subscription?
> My current submission file is written as follows:
> #$ -N first
> #$ -S /bin/bash
> #$ -cwd
> #$ -e $JOB_ID.$JOB_NAME.ERROR
> #$ -o $JOB_ID.$JOB_NAME.OUTPUT
> #$ -P faculty_prj
> #$ -p 0
> #$ -pe orte 8
> /opt/mpi/openmpi/1.3.3/gnu/bin/mpirun -np $NSLOTS ./test_vel.out
> Is it possible to allow over-subscription by modifying the submission
> file itself? Or do I need to change the hostfiles somehow?
> Thanks for your help!
> Best Regards
> Santosh Ansumali,
> Faculty Fellow,
> Engineering Mechanics Unit
> Jawaharlal Nehru Centre for Advanced Scientific Research (JNCASR)
> Jakkur, Bangalore-560 064, India
> Tel: + 91 80 22082938