We've gone around on this one a few times too. We finally settled on the current formula and confirmed it did what the slurm folks expected, so I'm somewhat loath to change it given that situation.
I suggest you take it up with the slurm folks to find out what behavior is expected when tasks_per_node and cpus_per_task are set. How many application processes are expected to be run on the node?
Part of the problem (as I recall) was that the meaning of tasks_per_node changed across a slurm release. At one time, it actually meant "cpus_per_node", and so you had to do the division to get the ppn correct. I'm not sure what it means today, but since Livermore writes slurm and the folks there seem to be happy with the way this behaves...<shrug>
Let me know what you find out.
On Feb 26, 2010, at 9:45 AM, Damien Guinier wrote:
> Hi Ralph,
> I found a minor bug in the MCA component: ras slurm.
> It behaves incorrectly with the "X number of processors per task" feature.
> On the file orte/mca/ras/slurm/ras_slurm_module.c, line 356:
> - The node slot count is divided by the "cpus_per_task" value,
> but "cpus_per_task" is already accounted for at line 285.
> My suggestion is to not divide the node slot count a second time.
> My patch is:
> diff -r ef9d639ab011 -r 8f62269014c2 orte/mca/ras/slurm/ras_slurm_module.c
> --- a/orte/mca/ras/slurm/ras_slurm_module.c Wed Jan 20 18:29:12 2010 +0100
> +++ b/orte/mca/ras/slurm/ras_slurm_module.c Thu Feb 25 15:59:41 2010 +0100
> @@ -353,7 +353,8 @@
> node->state = ORTE_NODE_STATE_UP;
> node->slots_inuse = 0;
> node->slots_max = 0;
> - node->slots = slots[i] / cpus_per_task;
> + node->slots = slots[i];
> opal_list_append(nodelist, &node->super);