
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] trouble using openmpi under slurm
From: David Roundy (roundyd_at_[hidden])
Date: 2010-07-07 14:32:22


Alas, I'm sorry to hear that! I had hoped (assumed?) that the slurm
team would be hand-in-glove with the OMPI team in making sure the
interface between the two is smooth. :(

David

On Wed, Jul 7, 2010 at 11:09 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> Ah, if only it were that simple. Slurm is a very difficult beast to interface with, and I have yet to find a single, reliable marker across the various slurm releases to detect options we cannot support.
>
>
> On Jul 7, 2010, at 11:59 AM, David Roundy wrote:
>
>> On Wed, Jul 7, 2010 at 10:26 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>>> I'm afraid the bottom line is that OMPI simply doesn't support core-level allocations. I tried it on a slurm machine available to me, using our devel trunk as well as 1.4, with the same results.
>>>
>>> Not sure why you are trying to run that way, but I'm afraid you can't do it with OMPI.
>>
>> Hmmm.  I'm still trying to figure out how to configure slurm properly.
>> I want it to be able to put one single-process job per core on each
>> machine.  I just now figured out that there is a slurm "-n" option.  I
>> had previously only been aware of the "-N" and "-c" options, and the
>> latter seemed the closer match.  It looks like everything works fine with the
>> "-n" option.
>>
>> However, wouldn't it be a good idea to avoid crashing when "-c 2" is
>> used, e.g. by ignoring the environment variable SLURM_CPUS_PER_TASK?
>> It seems like this would be important to support if one wanted to run
>> MPI with multiple threads per node (as I've been known to do in the
>> past).
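>>
>> As a stopgap on my side, I would guess that something like this at the
>> top of a job script would sidestep the crash (./my_program is just a
>> placeholder, and I haven't actually tried this workaround):
>>
>>   # hide the cpus-per-task setting from mpirun
>>   unset SLURM_CPUS_PER_TASK
>>   mpirun ./my_program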
>>
>> In my troubleshooting, I came up with the following script, which can
>> reliably crash mpirun (when run without slurm, but obviously
>> pretending to be running under slurm).  :(
>>
>> #!/bin/sh
>> set -ev
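>> # pretend to be inside a slurm allocation: one node, one task per
>> # node, two cpus per task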
>> export SLURM_JOBID=137
>> export SLURM_TASKS_PER_NODE=1
>> export SLURM_NNODES=1
>> export SLURM_CPUS_PER_TASK=2
>> export SLURM_NODELIST=localhost
>> mpirun --display-devel-map echo hello world
>> echo it worked
>>
>> David

-- 
David Roundy