Ah, if only it were that simple. Slurm is a very difficult beast to interface with, and I have yet to find a single, reliable marker across the various Slurm releases for detecting options we cannot support.
On Jul 7, 2010, at 11:59 AM, David Roundy wrote:
> On Wed, Jul 7, 2010 at 10:26 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>> I'm afraid the bottom line is that OMPI simply doesn't support core-level allocations. I tried it on a slurm machine available to me, using our devel trunk as well as 1.4, with the same results.
>> Not sure why you are trying to run that way, but I'm afraid you can't do it with OMPI.
> Hmmm. I'm still trying to figure out how to configure slurm properly.
> I want it to be able to put one single-process job per core on each
> machine. I just now figured out that there is a slurm "-n" option. I
> had previously only been aware of the "-N" and "-c" options, and the
> latter was the closer match. It looks like everything works fine with the
> "-n" option.
> However, wouldn't it be a good idea to avoid crashing when "-c 2" is
> used, e.g. by ignoring the environment variable SLURM_CPUS_PER_TASK?
> It seems like this would be an important feature to be able to use if
> one wanted to run mpi with multiple threads per node (as I've been
> known to do in the past).
> While troubleshooting, I came up with the following script, which can
> reliably crash mpirun (when run without slurm, but obviously
> pretending to be running under slurm). :(
> set -ev
> export SLURM_JOBID=137
> export SLURM_TASKS_PER_NODE=1
> export SLURM_NNODES=1
> export SLURM_CPUS_PER_TASK=2
> export SLURM_NODELIST=localhost
> mpirun --display-devel-map echo hello world
> echo it worked
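
For anyone who hits the same crash and wants a caller-side workaround in the spirit of the "ignore SLURM_CPUS_PER_TASK" suggestion above, here is a minimal sketch. It assumes the variable is what triggers the failure, as the script above suggests, and it has not been tested against the OMPI versions discussed in this thread. The helper name run_without_cpus_per_task is hypothetical; it is not part of Open MPI or Slurm.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI or Slurm): run a command with
# SLURM_CPUS_PER_TASK removed from that command's environment only.
run_without_cpus_per_task() {
    (
        # The subshell keeps the unset from leaking back into the caller.
        unset SLURM_CPUS_PER_TASK
        exec "$@"
    )
}

# Demonstration that needs no mpirun: the child does not see the variable,
# while the calling shell keeps its original value.
export SLURM_CPUS_PER_TASK=2
run_without_cpus_per_task sh -c 'echo "child sees: ${SLURM_CPUS_PER_TASK:-unset}"'
echo "caller still has: $SLURM_CPUS_PER_TASK"
```

Inside a job this would be used as "run_without_cpus_per_task mpirun --display-devel-map echo hello world". Note that hiding the variable also hides the per-task CPU count from mpirun, so any mapping or binding decisions will be made without it.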