Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] trouble using openmpi under slurm
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-07-07 14:37:54

No, I'm afraid not. Things work pretty well, but there are places where the two just don't mesh. Sub-node allocation in particular is an issue because it implies binding, and Slurm and OMPI use conflicting methods for that.

It all can get worked out, but we have limited time and nobody cares enough to put in the effort. Slurm just isn't used enough to make it worthwhile (too small an audience).

On Jul 7, 2010, at 12:32 PM, David Roundy wrote:

> Alas, I'm sorry to hear that! I had hoped (assumed?) that the slurm
> team would be hand-in-glove with the OMPI team in making sure the
> interface between the two is smooth. :(
> David
> On Wed, Jul 7, 2010 at 11:09 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>> Ah, if only it were that simple. Slurm is a very difficult beast to interface with, and I have yet to find a single, reliable marker across the various slurm releases to detect options we cannot support.
>> On Jul 7, 2010, at 11:59 AM, David Roundy wrote:
>>> On Wed, Jul 7, 2010 at 10:26 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>>>> I'm afraid the bottom line is that OMPI simply doesn't support core-level allocations. I tried it on a slurm machine available to me, using our devel trunk as well as 1.4, with the same results.
>>>> Not sure why you are trying to run that way, but I'm afraid you can't do it with OMPI.
>>> Hmmm. I'm still trying to figure out how to configure slurm properly.
>>> I want it to be able to put one single-process job per core on each
>>> machine. I just now figured out that there is a slurm "-n" option. I
>>> had previously only been aware of the "-N" and "-c" options, and the
>>> latter was the closer match. It looks like everything works fine with the
>>> "-n" option.
>>> However, wouldn't it be a good idea to avoid crashing when "-c 2" is
>>> used, e.g. by ignoring the environment variable SLURM_CPUS_PER_TASK?
>>> It seems like this would be an important feature to be able to use if
>>> one wanted to run mpi with multiple threads per node (as I've been
>>> known to do in the past).
>>> In my troubleshooting, I came up with the following script, which can
>>> reliably crash mpirun (when run without slurm, but obviously
>>> pretending to be running under slurm). :(
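>>> #!/bin/sh
>>> # -e: exit on the first failing command; -v: echo each line as it is read
>>> set -ev
>>> # Fake the environment of a one-node slurm job
>>> export SLURM_JOBID=137
>>> export SLURM_NNODES=1
>>> # Equivalent of "-c 2"; this is the setting that triggers the crash
>>> export SLURM_CPUS_PER_TASK=2
>>> export SLURM_NODELIST=localhost
>>> mpirun --display-devel-map echo hello world
>>> # Only reached if mpirun exits successfully
>>> echo it worked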
>>> David
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
> --
> David Roundy
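
A user-side approximation of the suggestion to ignore SLURM_CPUS_PER_TASK is to drop the variable from the environment just for the mpirun call. The sketch below assumes the failure is triggered solely by that variable, as the reproduction script in the quoted message suggests. The helper name run_without_cpus_per_task is hypothetical and is not part of Open MPI or Slurm. Whether mpirun then maps processes as intended inside a real "-c" allocation is not established in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI or Slurm): run the given
# command with SLURM_CPUS_PER_TASK removed from its environment.
# The parentheses start a subshell, so the unset does not leak into
# the calling script.
run_without_cpus_per_task() {
    (
        unset SLURM_CPUS_PER_TASK
        "$@"
    )
}
```

It would be invoked as `run_without_cpus_per_task mpirun echo hello world`; the variable stays set for the rest of the job script.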