Open MPI User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-01-19 14:35:02


I think the SLURM code in Open MPI is making an assumption that is
failing in your case: we assume that your nodes will have a specific
naming convention:

mycluster.example.com --> head node
mycluster01.example.com --> cluster node 1
mycluster02.example.com --> cluster node 2
...etc.

OMPI is therefore parsing the SLURM environment and not correctly
grokking the "master,wolf1" string because, to be honest, I didn't
even know that SLURM supported that scenario. I.e., I thought SLURM
required the naming convention I listed above. In hindsight, that's
a pretty silly assumption, but to be fair, you're the first user that
ever came to us with this problem (i.e., we use pretty much the same
string parsing in LAM/MPI, which has had SLURM support for several
years). Oops!

We can fix this, but I don't know if it'll make the v1.2 cutoff or
not. :-\

Thanks for bringing this to our attention!

On Jan 19, 2007, at 1:50 PM, Robert Bicknell wrote:

> Thanks for your response. The program that I have been using for
> testing purposes is a simple hello:
>
>
> #include <stdio.h>
> #include <unistd.h>
> #include <mpi.h>
>
> int main(int argc, char **argv)
> {
> char name[MPI_MAX_PROCESSOR_NAME];
> int length;
> int rank;
>
> MPI_Init(&argc, &argv);
> MPI_Get_processor_name(name, &length);
> MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>
> printf("%s: hello world from rank %d\n", name, rank);
> sleep(1);
>
> MPI_Finalize();
> return 0;
> }
>
> If I run this program not in a slurm environment I get the following
>
> mpirun -np 4 -mca btl tcp,self -host wolf1,master ./hello
>
> master: hello world from rank 1
> wolf1: hello world from rank 0
> wolf1: hello world from rank 2
> master: hello world from rank 3
>
> This is exactly what I expect. Now if I create a slurm environment
> using the following:
>
> srun -n 4 -A
>
> The output of printenv | grep SLURM gives me:
>
> SLURM_NODELIST=master,wolf1
> SLURM_SRUN_COMM_PORT=58929
> SLURM_MEM_BIND_TYPE=
> SLURM_CPU_BIND_VERBOSE=quiet
> SLURM_MEM_BIND_LIST=
> SLURM_CPU_BIND_LIST=
> SLURM_NNODES=2
> SLURM_JOBID=66135
> SLURM_TASKS_PER_NODE=2(x2)
> SLURM_SRUN_COMM_HOST=master
> SLURM_CPU_BIND_TYPE=
> SLURM_MEM_BIND_VERBOSE=quiet
> SLURM_NPROCS=4
>
> This seems to indicate that both master and wolf1 have been
> allocated and that each node should run 2 tasks, which is correct
> since both master and wolf1 are dual processor machines.
>
> Now if I run:
>
> mpirun -np 4 -mca btl tcp,self ./hello
>
> The output is:
>
> master: hello world from rank 1
> master: hello world from rank 2
> master: hello world from rank 3
> master: hello world from rank 0
>
>
> All four processes are running on master and none on wolf1.
>
> If I try the following and specify the hosts, I get the following
> error message.
>
> mpirun -np 4 -host wolf1,master -mca btl tcp,self ./hello
>
> --------------------------------------------------------------------------
> Some of the requested hosts are not included in the current
> allocation for the
> application:
> ./hello
> The requested hosts were:
> wolf1,master
>
> Verify that you have mapped the allocated resources properly using the
> --host specification.
> --------------------------------------------------------------------------
> [master:28022] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
> rmgr_urm.c at line 377
> [master:28022] mpirun: spawn failed with errno=-2
>
>
> I'm at a loss to figure out how to get this working correctly. Any
> help would be greatly appreciated.
>
> Bob
>
> On 1/19/07, Ralph Castain <rhc_at_[hidden]> wrote:
>
> Open MPI and SLURM should work together just fine right
> out-of-the-box. The typical command progression is:
>
> srun -n x -A
> mpirun -n y .....
>
>
> If you are doing those commands and still see everything running on
> the head
> node, then two things could be happening:
>
> (a) you really aren't getting an allocation from slurm. Perhaps you
> don't
> have slurm setup correctly and aren't actually seeing the
> allocation in your
> environment. Do a "printenv | grep SLURM" and see if you find the
> following
> variables:
> SLURM_NPROCS=8
> SLURM_CPU_BIND_VERBOSE=quiet
> SLURM_CPU_BIND_TYPE=
> SLURM_CPU_BIND_LIST=
> SLURM_MEM_BIND_VERBOSE=quiet
> SLURM_MEM_BIND_TYPE=
> SLURM_MEM_BIND_LIST=
> SLURM_JOBID=47225
> SLURM_NNODES=2
> SLURM_NODELIST=odin[013-014]
> SLURM_TASKS_PER_NODE=4(x2)
> SLURM_SRUN_COMM_PORT=43206
> SLURM_SRUN_COMM_HOST=odin
>
> Obviously, the values will be different, but we really need the
> TASKS_PER_NODE and NODELIST ones to be there.
>
> (b) the master node is being included in your nodelist and you aren't
> running enough mpi processes to need more nodes (i.e., the number
> of slots
> on the master node is greater than or equal to the num procs you
> requested).
> You can force Open MPI to not run on your master node by including
> "--nolocal" on your command line.
>
> Of course, if the master node is the only thing on the nodelist,
> this will
> cause mpirun to abort as there is nothing else for us to use.
>
> Hope that helps
> Ralph
>
>
> On 1/18/07 11:03 PM, "Robert Bicknell" <robbicknell_at_[hidden]> wrote:
>
> > I'm trying to get slurm and openmpi to work together on a debian,
> two
> > node cluster. Slurm and openmpi seem to work fine separately,
> > but when
> > I try to run a mpi program in a slurm allocation, all the
> processes get
> > run on the master node, and not distributed to the second node.
> What am
> > I doing wrong?
> >
> > Bob
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems