Alas, I'm sorry to hear that! I had hoped (assumed?) that the slurm
team would be hand-in-glove with the OMPI team in making sure the
interface between the two is smooth. :(
On Wed, Jul 7, 2010 at 11:09 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> Ah, if only it were that simple. Slurm is a very difficult beast to interface with, and I have yet to find a single, reliable marker across the various slurm releases to detect options we cannot support.
> On Jul 7, 2010, at 11:59 AM, David Roundy wrote:
>> On Wed, Jul 7, 2010 at 10:26 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>>> I'm afraid the bottom line is that OMPI simply doesn't support core-level allocations. I tried it on a slurm machine available to me, using our devel trunk as well as 1.4, with the same results.
>>> Not sure why you are trying to run that way, but I'm afraid you can't do it with OMPI.
>> Hmmm. I'm still trying to figure out how to configure slurm properly.
>> I want it to be able to put one single-process job per core on each
>> machine. I just now figured out that there is a slurm "-n" option. I
>> had previously only been aware of the "-N" and "-c" options, and the
>> latter was the closer match. It looks like everything works fine with the
>> "-n" option.
>> However, wouldn't it be a good idea to avoid crashing when "-c 2" is
>> used, e.g. by ignoring the environment variable SLURM_CPUS_PER_TASK?
>> It seems like this would be an important feature to be able to use if
>> one wanted to run mpi with multiple threads per node (as I've been
>> known to do in the past).
>> In my troubleshooting, I came up with the following script, which can
>> reliably crash mpirun (when run without slurm, but obviously
>> pretending to be running under slurm). :(
>> set -ev
>> export SLURM_JOBID=137
>> export SLURM_TASKS_PER_NODE=1
>> export SLURM_NNODES=1
>> export SLURM_CPUS_PER_TASK=2
>> export SLURM_NODELIST=localhost
>> mpirun --display-devel-map echo hello world
>> echo it worked
>> users mailing list
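The suggestion quoted above (have mpirun ignore SLURM_CPUS_PER_TASK) can be approximated from the user side by hiding that variable from the launched command. The following is only a minimal sketch: the helper name run_without_cpus_per_task is hypothetical and is not part of Open MPI or Slurm, and whether hiding the variable actually avoids the crash described in this thread is an assumption that has not been verified here. A plain sh command stands in for mpirun so the sketch runs on any machine.

```shell
# Hypothetical helper (not part of Open MPI or Slurm): run a command with
# SLURM_CPUS_PER_TASK removed from that command's environment only.
# The function body is a subshell, so the caller's environment is unchanged.
run_without_cpus_per_task() (
    unset SLURM_CPUS_PER_TASK
    "$@"
)

# Mimic the relevant part of the reproduction environment quoted above.
export SLURM_CPUS_PER_TASK=2

# Stand-in for "mpirun ...": report what the child process sees.
run_without_cpus_per_task sh -c 'echo "child sees: ${SLURM_CPUS_PER_TASK:-unset}"'

# The calling shell still has the original value.
echo "parent sees: $SLURM_CPUS_PER_TASK"
```

Under the same assumption, the stand-in would be replaced by the real launch, e.g. run_without_cpus_per_task mpirun --display-devel-map echo hello world. Note that this also discards the "-c" request itself, so it is a workaround for the crash at best, not a way to get multiple cores per task.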