On 11/16/2010 04:26 AM, Chris Jewell wrote:
Hi all,

On 11/15/2010 02:11 PM, Reuti wrote: 
Just to give my understanding of the problem: 

Sorry, I am still trying to grok all of your email and what problem you 
are trying to solve. So is the issue trying to have two jobs with 
processes on the same node bind their processes to different 
resources? Like core 1 for the first job and cores 2 and 3 for the 2nd job? 

You can't get 2 slots on a machine, as it's limited by the core count to one here, so such a slot allocation shouldn't occur at all. 
So to clarify, the current -binding <binding_strategy>:<binding_amount> allocates binding_amount cores to each sge_shepherd process associated with a job_id.  There appears to be only one sge_shepherd process per job_id per execution node, so all child processes run on these allocated cores.  This is irrespective of the number of slots allocated to the node.  
I believe the above is correct.
I agree with Reuti that the binding_amount parameter should be a maximum number of bound cores per node, with the actual number determined by the number of slots allocated per node.  FWIW, an alternative approach might be to have another binding_type ('slot', say) that automatically allocated one core per slot.
That might be correct, I've put in a question to someone who should know.
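
To make the three behaviours discussed above concrete, here is a minimal sketch in Python. It is only an illustration of the rules as described in this thread, not SGE code: the function names and the example node names are hypothetical, and the 'slot' binding_type does not exist in SGE; it is just the suggestion made above.

```python
# Illustrative only: models the per-node core counts discussed in this thread.
# None of these functions exist in SGE; all names are hypothetical.

def cores_bound_current(binding_amount, slots_on_node):
    """Behaviour as described in the thread: the single sge_shepherd per
    job per node is bound to binding_amount cores, regardless of slots."""
    return binding_amount

def cores_bound_capped(binding_amount, slots_on_node):
    """Proposal: binding_amount is a per-node maximum; the actual number
    of bound cores follows the slots granted on that node."""
    return min(binding_amount, slots_on_node)

def cores_bound_slot_type(slots_on_node):
    """Alternative proposal: a hypothetical 'slot' binding_type that
    binds exactly one core per slot."""
    return slots_on_node

if __name__ == "__main__":
    # Hypothetical allocation: 8 slots split unevenly across two nodes.
    allocation = {"nodeA": 2, "nodeB": 6}
    binding_amount = 4
    for node, slots in allocation.items():
        print(node,
              cores_bound_current(binding_amount, slots),
              cores_bound_capped(binding_amount, slots),
              cores_bound_slot_type(slots))
```

With that hypothetical allocation, nodeA (2 slots) would get 4, 2 and 2 cores under the three rules, and nodeB (6 slots) would get 4, 4 and 6.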
Of course, a complex situation might arise if a user submits a combined MPI/multithreaded job, but then I guess we're into the realm of setting allocation_rule.
Yes, that would get ugly.
Is it going to be worth looking at creating a patch for this?  I don't know much of the internals of SGE -- would it be hard work to do?  I've not that much time to dedicate towards it, but I could put some effort in if necessary...

Is the patch you're wanting for a "slot" binding_type?

Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.dontje@oracle.com