Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] openmpi-1.7.5a1r30692 and slurm problems
From: Adrian Reber (adrian_at_[hidden])
Date: 2014-02-12 10:50:30


On Wed, Feb 12, 2014 at 07:47:53AM -0800, Ralph Castain wrote:
> >
> > $ msub -I -l nodes=3:ppn=8
> > salloc: Job is in held state, pending scheduler release
> > salloc: Pending job allocation 131828
> > salloc: job 131828 queued and waiting for resources
> > salloc: job 131828 has been allocated resources
> > salloc: Granted job allocation 131828
> > sh-4.1$ echo $SLURM_TASKS_PER_NODE
> > 1
> > sh-4.1$ rpm -q slurm
> > slurm-2.6.5-1.el6.x86_64
> > sh-4.1$ echo $SLURM_NNODES
> > 1
> > sh-4.1$ echo $SLURM_JOB_NODELIST
> > xxxx[107-108,176]
> > sh-4.1$ echo $SLURM_JOB_CPUS_PER_NODE
> > 8(x3)
> > sh-4.1$ echo $SLURM_NODELIST
> > xxxx[107-108,176]
> > sh-4.1$ echo $SLURM_NPROCS
> > 1
> > sh-4.1$ echo $SLURM_NTASKS
> > 1
> > sh-4.1$ echo $SLURM_TASKS_PER_NODE
> > 1
> >
> > The information in *_NODELIST looks right (three nodes), but the count
> > variables (NPROCS, NTASKS, NNODES, TASKS_PER_NODE) all report '1', which
> > seems wrong.
>
> Indeed - and that's the problem. Slurm 2.6.5 is the most recent release, and my guess is that SchedMD once again has changed the @$!#%#@ meaning of their envars. Frankly, it is nearly impossible to track all the variants they have created over the years.
>
> Please check whether someone did some customizing on your end, as people sometimes do that to Slurm. It could also be that something in the Slurm config file is causing the changed behavior.
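
For the record, the mismatch can be confirmed from inside the allocation
itself; a quick sanity check (just a sketch, assuming the standard scontrol
client is on the path) would be:

  # expand the hostlist and count the nodes ourselves
  scontrol show hostnames "$SLURM_JOB_NODELIST" | wc -l   # prints 3 for xxxx[107-108,176]
  # compare with what the batch environment claims
  echo "$SLURM_NNODES $SLURM_NTASKS $SLURM_NPROCS"        # prints "1 1 1" here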

I will try to find out if there is something special about the Slurm
configuration and let you know as soon as possible.
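
To look for local customizations, something like this should do (again just
a sketch; the parameters grepped for are only examples of settings that
could change how an allocation is laid out):

  # dump the running slurmctld configuration instead of guessing the file path
  scontrol show config | grep -i -E 'SelectType|TaskPlugin|SchedulerType'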

                Adrian