
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] openmpi-1.7.5a1r30692 and slurm problems
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-02-12 10:47:53


On Feb 12, 2014, at 7:32 AM, Adrian Reber <adrian_at_[hidden]> wrote:

>
> $ msub -I -l nodes=3:ppn=8
> salloc: Job is in held state, pending scheduler release
> salloc: Pending job allocation 131828
> salloc: job 131828 queued and waiting for resources
> salloc: job 131828 has been allocated resources
> salloc: Granted job allocation 131828
> sh-4.1$ echo $SLURM_TASKS_PER_NODE
> 1
> sh-4.1$ rpm -q slurm
> slurm-2.6.5-1.el6.x86_64
> sh-4.1$ echo $SLURM_NNODES
> 1
> sh-4.1$ echo $SLURM_JOB_NODELIST
> xxxx[107-108,176]
> sh-4.1$ echo $SLURM_JOB_CPUS_PER_NODE
> 8(x3)
> sh-4.1$ echo $SLURM_NODELIST
> xxxx[107-108,176]
> sh-4.1$ echo $SLURM_NPROCS
> 1
> sh-4.1$ echo $SLURM_NTASKS
> 1
> sh-4.1$ echo $SLURM_TASKS_PER_NODE
> 1
>
> The information in *_NODELIST seems to make sense, but all the other
> variables (PROCS, TASKS, NODES) report '1', which seems wrong.

Indeed - and that's the problem. Slurm 2.6.5 is the most recent release, and my guess is that SchedMD once again has changed the @$!#%#@ meaning of their envars. Frankly, it is nearly impossible to track all the variants they have created over the years.

Please check whether someone did a little customizing of Slurm on your end, as people sometimes do. It could also be that something in the Slurm config file is causing the changed behavior.

Meantime, I'll try to ponder a potential solution in case this really is the "latest" Slurm screwup.
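For reference, here is a minimal sketch of how the compressed SLURM_JOB_CPUS_PER_NODE value (the "8(x3)" shown above) can be expanded into per-node slot counts. This is an illustrative parser only, not Open MPI's actual Slurm allocation code; the function name is made up for the example.

```python
import re

# Illustrative sketch (not Open MPI's actual slurm component): expand the
# compressed SLURM_JOB_CPUS_PER_NODE format, e.g. "8(x3)" means 8 CPUs on
# each of 3 nodes, and "4,2(x2)" means one 4-CPU node plus two 2-CPU nodes.
def expand_cpus_per_node(value):
    counts = []
    for part in value.split(","):
        m = re.fullmatch(r"(\d+)(?:\(x(\d+)\))?", part)
        if m is None:
            raise ValueError("unrecognized entry: %r" % part)
        cpus = int(m.group(1))
        repeat = int(m.group(2)) if m.group(2) else 1
        counts.extend([cpus] * repeat)
    return counts

# The allocation above ("8(x3)") should yield 3 nodes and 24 total slots:
print(expand_cpus_per_node("8(x3)"), sum(expand_cpus_per_node("8(x3)")))
# prints [8, 8, 8] 24
```

If the total computed this way (24 slots) disagrees with what mpirun believes is available, the discrepancy lies in how the envars are being interpreted rather than in the allocation itself.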

>
>
> On Wed, Feb 12, 2014 at 07:19:54AM -0800, Ralph Castain wrote:
>> ...and your version of Slurm?
>>
>> On Feb 12, 2014, at 7:19 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>>
>>> What is your SLURM_TASKS_PER_NODE?
>>>
>>> On Feb 12, 2014, at 6:58 AM, Adrian Reber <adrian_at_[hidden]> wrote:
>>>
>>>> No, the system has only a few MOAB_* variables and many SLURM_*
>>>> variables:
>>>>
>>>> $BASH $IFS $SECONDS $SLURM_PTY_PORT
>>>> $BASHOPTS $LINENO $SHELL $SLURM_PTY_WIN_COL
>>>> $BASHPID $LINES $SHELLOPTS $SLURM_PTY_WIN_ROW
>>>> $BASH_ALIASES $MACHTYPE $SHLVL $SLURM_SRUN_COMM_HOST
>>>> $BASH_ARGC $MAILCHECK $SLURMD_NODENAME $SLURM_SRUN_COMM_PORT
>>>> $BASH_ARGV $MOAB_CLASS $SLURM_CHECKPOINT_IMAGE_DIR $SLURM_STEPID
>>>> $BASH_CMDS $MOAB_GROUP $SLURM_CONF $SLURM_STEP_ID
>>>> $BASH_COMMAND $MOAB_JOBID $SLURM_CPUS_ON_NODE $SLURM_STEP_LAUNCHER_PORT
>>>> $BASH_LINENO $MOAB_NODECOUNT $SLURM_DISTRIBUTION $SLURM_STEP_NODELIST
>>>> $BASH_SOURCE $MOAB_PARTITION $SLURM_GTIDS $SLURM_STEP_NUM_NODES
>>>> $BASH_SUBSHELL $MOAB_PROCCOUNT $SLURM_JOBID $SLURM_STEP_NUM_TASKS
>>>> $BASH_VERSINFO $MOAB_SUBMITDIR $SLURM_JOB_CPUS_PER_NODE $SLURM_STEP_TASKS_PER_NODE
>>>> $BASH_VERSION $MOAB_USER $SLURM_JOB_ID $SLURM_SUBMIT_DIR
>>>> $COLUMNS $OPTERR $SLURM_JOB_NODELIST $SLURM_SUBMIT_HOST
>>>> $COMP_WORDBREAKS $OPTIND $SLURM_JOB_NUM_NODES $SLURM_TASKS_PER_NODE
>>>> $DIRSTACK $OSTYPE $SLURM_LAUNCH_NODE_IPADDR $SLURM_TASK_PID
>>>> $EUID $PATH $SLURM_LOCALID $SLURM_TOPOLOGY_ADDR
>>>> $GROUPS $POSIXLY_CORRECT $SLURM_NNODES $SLURM_TOPOLOGY_ADDR_PATTERN
>>>> $HISTCMD $PPID $SLURM_NODEID $SRUN_DEBUG
>>>> $HISTFILE $PS1 $SLURM_NODELIST $TERM
>>>> $HISTFILESIZE $PS2 $SLURM_NPROCS $TMPDIR
>>>> $HISTSIZE $PS4 $SLURM_NTASKS $UID
>>>> $HOSTNAME $PWD $SLURM_PRIO_PROCESS $_
>>>> $HOSTTYPE $RANDOM $SLURM_PROCID
>>>>
>>>>
>>>>
>>>> On Wed, Feb 12, 2014 at 06:12:45AM -0800, Ralph Castain wrote:
>>>>> Seems rather odd - since this is managed by Moab, you shouldn't be seeing SLURM envars at all. What you should see are PBS_* envars, including a PBS_NODEFILE that actually contains the allocation.
>>>>>
>>>>>
>>>>> On Feb 12, 2014, at 4:42 AM, Adrian Reber <adrian_at_[hidden]> wrote:
>>>>>
>>>>>> I tried the nightly snapshot (openmpi-1.7.5a1r30692.tar.gz) on a system
>>>>>> with slurm and moab. I requested an interactive session using:
>>>>>>
>>>>>> msub -I -l nodes=3:ppn=8
>>>>>>
>>>>>> and started a simple test case which fails:
>>>>>>
>>>>>> $ mpirun -np 2 ./mpi-test 1
>>>>>> --------------------------------------------------------------------------
>>>>>> There are not enough slots available in the system to satisfy the 2 slots
>>>>>> that were requested by the application:
>>>>>> ./mpi-test
>>>>>>
>>>>>> Either request fewer slots for your application, or make more slots available
>>>>>> for use.
>>>>>> --------------------------------------------------------------------------
>>>>>> srun: error: xxxx108: task 1: Exited with exit code 1
>>>>>> srun: Terminating job step 131823.4
>>>>>> srun: error: xxxx107: task 0: Exited with exit code 1
>>>>>> srun: Job step aborted
>>>>>> slurmd[xxxx108]: *** STEP 131823.4 KILLED AT 2014-02-12T13:30:32 WITH SIGNAL 9 ***
>>>>>>
>>>>>>
>>>>>> requesting only one core works:
>>>>>>
>>>>>> $ mpirun ./mpi-test 1
>>>>>> 4.4.7 20120313 (Red Hat 4.4.7-4):Process 0 on xxxx106 out of 1: 0.000000
>>>>>> 4.4.7 20120313 (Red Hat 4.4.7-4):Process 0 on xxxx106 out of 1: 0.000000
>>>>>>
>>>>>>
>>>>>> using openmpi-1.6.5 works with multiple cores:
>>>>>>
>>>>>> $ mpirun -np 24 ./mpi-test 2
>>>>>> 4.4.7 20120313 (Red Hat 4.4.7-4):Process 0 on xxxx106 out of 24: 0.000000
>>>>>> 4.4.7 20120313 (Red Hat 4.4.7-4):Process 12 on xxxx106 out of 24: 12.000000
>>>>>> 4.4.7 20120313 (Red Hat 4.4.7-4):Process 11 on xxxx108 out of 24: 11.000000
>>>>>> 4.4.7 20120313 (Red Hat 4.4.7-4):Process 18 on xxxx106 out of 24: 18.000000
>>>>>>
>>>>>> $ echo $SLURM_JOB_CPUS_PER_NODE
>>>>>> 8(x3)
>>>>>>
>>>>>> I have never used Slurm before, so this could also be a user error on
>>>>>> my side. But since 1.6.5 works, it seems something has changed, and I
>>>>>> wanted to let you know in case it was not intentional.
>>>>>>
>>>>>> Adrian
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> devel_at_[hidden]
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>
>>>>
>>>> Adrian
>>>>
>>>> --
>>>> Adrian Reber <adrian_at_[hidden]> http://lisas.de/~adrian/
>>>> "Let us all bask in television's warm glowing warming glow." -- Homer Simpson
>>>
>>
>
> Adrian
>
> --
> Adrian Reber <adrian_at_[hidden]> http://lisas.de/~adrian/
> There's got to be more to life than compile-and-go.