Open MPI User's Mailing List Archives

From: Tim Prins (tprins_at_[hidden])
Date: 2007-06-26 14:40:47


Hi Jeff,

If you submit a batch script, there is no need to do a salloc.

See the Open MPI FAQ for details on how to run on SLURM:
http://www.open-mpi.org/faq/?category=slurm
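
Put concretely, the batch script only needs to invoke mpirun; SLURM
creates the allocation when the script is submitted. A minimal sketch
(the script name myscript.sh is just a placeholder):

#!/bin/sh
# SLURM has already created the allocation by the time this runs.
# Open MPI picks up the allocated CPUs, so no -np flag is needed.
mpirun my_mpi_application

which you would then submit with something like: srun -n4 -b myscript.sh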

Hope this helps.

Tim

On Wednesday 27 June 2007 14:21, Jeff Pummill wrote:
> Hey Jeff,
>
> Finally got my test nodes back and was looking at the info you sent.
> The SLURM page states the following:
>
> *Open MPI* <http://www.open-mpi.org/> relies upon SLURM to allocate
> resources for the job and then mpirun to initiate the tasks. When using
> the salloc command, mpirun's -nolocal option is recommended. For example:
>
> $ salloc -n4 sh # allocates 4 processors and spawns shell for job
> > mpirun -np 4 -nolocal a.out
> > exit # exits shell spawned by initial salloc command
>
> You are saying that I need to use the SLURM salloc, then pass SLURM a
> script? Or could I just add it all into the script? For example:
>
> #!/bin/sh
> salloc -n4
> mpirun my_mpi_application
>
> Then, run with srun -b myscript.sh
>
>
> Jeff F. Pummill
> Senior Linux Cluster Administrator
> University of Arkansas
> Fayetteville, Arkansas 72701
> (479) 575 - 4590
> http://hpc.uark.edu
>
> "A supercomputer is a device for turning compute-bound
> problems into I/O-bound problems." -Seymour Cray
>
> Jeff Squyres wrote:
> > Ick; I'm surprised that we don't have this info on the FAQ. I'll try
> > to rectify that shortly.
> >
> > How are you launching your jobs through SLURM? OMPI currently does
> > not support the "srun -n X my_mpi_application" model for launching
> > MPI jobs. You must either use the -A option to srun (i.e., get an
> > interactive SLURM allocation) or use the -b option (submit a script
> > that runs on the first node in the allocation). Your script can be
> > quite short:
> >
> > #!/bin/sh
> > mpirun my_mpi_application
> >
> > Note that OMPI will automatically figure out how many CPUs are in
> > your SLURM allocation, so you don't need to specify "-np X". Hence,
> > you can run the same script without modification no matter how many
> > CPUs/nodes you get from SLURM.
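> >
> > For example, saving that script as myscript.sh (any name works), you
> > could submit it with something like:
> >
> >   srun -n 4 -b myscript.sh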
> >
> > It's on the long-term plan to get the "srun -n X my_mpi_application"
> > model to work; it just hasn't bubbled up high enough in the priority
> > stack yet... :-\
> >
> > On Jun 20, 2007, at 1:59 PM, Jeff Pummill wrote:
> >> Just started working with the Open MPI / SLURM combo this morning. I
> >> can successfully launch jobs from the command line and they run to
> >> completion, but when launching from SLURM they hang.
> >>
> >> They appear to just sit with no load apparent on the compute nodes
> >> even though SLURM indicates they are running...
> >>
> >> [jpummil_at_trillion ~]$ sinfo -l
> >> Wed Jun 20 12:32:29 2007
> >> PARTITION AVAIL TIMELIMIT JOB_SIZE ROOT SHARE GROUPS NODES STATE NODELIST
> >> debug* up infinite 1-infinite no no all 8 allocated compute-1-[1-8]
> >> debug* up infinite 1-infinite no no all 1 idle compute-1-0
> >>
> >> [jpummil_at_trillion ~]$ squeue -l
> >> Wed Jun 20 12:32:20 2007
> >> JOBID PARTITION NAME USER STATE TIME TIMELIMIT NODES NODELIST(REASON)
> >> 79 debug mpirun jpummil RUNNING 5:27 UNLIMITED 2 compute-1-[1-2]
> >> 78 debug mpirun jpummil RUNNING 5:58 UNLIMITED 2 compute-1-[3-4]
> >> 77 debug mpirun jpummil RUNNING 7:00 UNLIMITED 2 compute-1-[5-6]
> >> 74 debug mpirun jpummil RUNNING 11:39 UNLIMITED 2 compute-1-[7-8]
> >>
> >> Are there any known issues of this nature involving OpenMPI and SLURM?
> >>
> >> Thanks!
> >>
> >> Jeff F. Pummill
> >>
> >> _______________________________________________
> >> users mailing list
> >> users_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users