On Fri, 21 Mar 2008 17:41:28 -0400
"Sacerdoti, Federico" <Federico.Sacerdoti_at_[hidden]> wrote:
> Ralph, we wrote a launcher for mvapich that uses srun to launch but
> keeps tight control of where processes are started. The way we did it
> was to force srun to launch a single process on a particular node.
>
> The launcher calls many of these:
> srun --jobid $JOBID -N 1 -n 1 -w host005 CMD ARGS
My workaround will be an mpirun wrapper that looks something like this:
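#!/bin/bash
# Build a temporary Open MPI hostfile from the nodes of the current SLURM allocation.
hostfile=$(mktemp) || exit 1
# srun prints one hostname per task; rewrite "   4 node001" as "node001 slots=4".
srun /bin/hostname | sort | uniq -c | sed -e 's/ *\([0-9]\+\) \+\(.\+\)/\2 slots=\1/' > "$hostfile"
/usr/bin/mpirun.openmpi-1.2.4 --hostfile "$hostfile" "$@"
status=$?
rm -f "$hostfile"
exit $status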
That is, we collect the node names with "srun /bin/hostname", sort and
count them, and convert the result into the format of a hostfile:
node001 slots=4
node002 slots=2
...
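
To check the counting step without a SLURM allocation, here is a minimal
sketch that feeds fixed hostnames through the same sort/count pipeline. The
function name to_hostfile is made up for illustration, and awk is used in
place of the sed expression above, since the \+ in that expression is a GNU
extension. The sample hostnames stand in for the output of
"srun /bin/hostname".

```shell
#!/bin/bash
# to_hostfile is a hypothetical helper name, not part of SLURM or Open MPI.
# It reads one hostname per line on stdin and prints "<host> slots=<count>"
# for each distinct host.
to_hostfile() {
    sort | uniq -c | awk '{ print $2 " slots=" $1 }'
}

# Sample input standing in for the output of "srun /bin/hostname".
printf '%s\n' node001 node002 node001 node001 node002 node001 | to_hostfile
```

For this sample input it prints the two hostfile lines shown above.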
But that's definitely not the SLURM API Ralph was talking about :-)
Werner