Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] trouble using openmpi under slurm
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-07-07 14:09:36


Ah, if only it were that simple. Slurm is a very difficult beast to interface with, and I have yet to find a single, reliable marker across the various slurm releases to detect options we cannot support.

On Jul 7, 2010, at 11:59 AM, David Roundy wrote:

> On Wed, Jul 7, 2010 at 10:26 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>> I'm afraid the bottom line is that OMPI simply doesn't support core-level allocations. I tried it on a slurm machine available to me, using our devel trunk as well as 1.4, with the same results.
>>
>> Not sure why you are trying to run that way, but I'm afraid you can't do it with OMPI.
>
> Hmmm. I'm still trying to figure out how to configure slurm properly.
> I want it to be able to put one single-process job per core on each
> machine. I just now figured out that there is a slurm "-n" option. I
> had previously only been aware of the "-N" and "-c" options, and the
> latter was a closer match. It looks like everything works fine with the
> "-n" option.
>
> However, wouldn't it be a good idea to avoid crashing when "-c 2" is
> used, e.g. by ignoring the environment variable SLURM_CPUS_PER_TASK?
> It seems like this would be an important feature to be able to use if
> one wanted to run mpi with multiple threads per node (as I've been
> known to do in the past).
>
> In my troubleshooting, I came up with the following script, which can
> reliably crash mpirun (when run without slurm, but obviously
> pretending to be running under slurm). :(
>
> #!/bin/sh
> set -ev
> export SLURM_JOBID=137
> export SLURM_TASKS_PER_NODE=1
> export SLURM_NNODES=1
> export SLURM_CPUS_PER_TASK=2
> export SLURM_NODELIST=localhost
> mpirun --display-devel-map echo hello world
> echo it worked
>
> David
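
For readers who hit the same crash: David's suggestion of ignoring SLURM_CPUS_PER_TASK can be approximated from the user side by hiding that variable from mpirun. The sketch below is only that, a sketch. The helper name `run_without_cpus_per_task` is hypothetical, and whether unsetting the variable actually avoids the crash on a given Open MPI release is an untested assumption that the thread does not confirm.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI or Slurm): run a command
# with SLURM_CPUS_PER_TASK removed from its environment only.
run_without_cpus_per_task() {
    (
        # The subshell keeps the unset local, so the caller's
        # environment is left untouched.
        unset SLURM_CPUS_PER_TASK
        "$@"
    )
}

# Example use inside a job script (assumption: mpirun then falls back
# to the remaining SLURM_* variables):
#   run_without_cpus_per_task mpirun echo hello world
```

Because the variable is unset only inside the subshell, other tools in the same job script that rely on "-c" still see the original value.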