Hi all,
On 11/15/2010 02:11 PM, Reuti wrote:
Just to give my understanding of the problem:
Sorry, I am still trying to grok your whole email and what problem you
are trying to solve. So is the issue to let two jobs that have
processes on the same node bind their processes to different
resources? Like core 1 for the first job and cores 2 and 3 for the 2nd job?
--td
You can't get 2 slots on a machine, as it's limited by the core count to one here, so such a slot allocation shouldn't occur at all.
So to clarify, the current -binding <binding_strategy>:<binding_amount> allocates binding_amount cores to each sge_shepherd process associated with a job_id. There appears to be only one sge_shepherd process per job_id per execution node, so all child processes run on these allocated cores. This is irrespective of the number of slots allocated to the node.
I believe the above is correct.
I agree with Reuti that the binding_amount parameter should be a maximum number of bound cores per node, with the actual number determined by the number of slots allocated per node. FWIW, an alternative approach might be to have another binding_type ('slot', say) that automatically allocated one core per slot.
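To make the difference between the three behaviours concrete, here is a minimal Python sketch. It is only an illustration of the semantics discussed above, not SGE code: the function name cores_to_bind and the mode labels "current", "capped" and "slot" are made up for this example.

```python
def cores_to_bind(slots_on_node, binding_amount, mode="current"):
    """Number of cores bound for one job on one execution node.

    Illustrative only; the mode names are hypothetical labels:
      "current" - binding_amount cores, regardless of the slot count
      "capped"  - binding_amount acts as a per-node maximum
      "slot"    - one core per slot (binding_amount is ignored)
    """
    if slots_on_node < 1 or binding_amount < 1:
        raise ValueError("slots_on_node and binding_amount must be >= 1")
    if mode == "current":
        return binding_amount
    if mode == "capped":
        return min(binding_amount, slots_on_node)
    if mode == "slot":
        return slots_on_node
    raise ValueError(f"unknown mode: {mode!r}")


# Example: a job granted 1 slot on node A and 3 slots on node B,
# submitted with a binding_amount of 2.
allocation = {"nodeA": 1, "nodeB": 3}
for mode in ("current", "capped", "slot"):
    bound = {node: cores_to_bind(slots, 2, mode)
             for node, slots in allocation.items()}
    print(mode, bound)
```

With the current behaviour both nodes get 2 bound cores, so node A binds more cores than it has slots and node B fewer. Under the capped variant node A drops to 1 core, and under the per-slot variant node B goes up to 3.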
That might be correct. I've put in a question to someone who should
know.