Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Can not run a parallel job on all the nodes in the cluster
From: Reuti (reuti_at_[hidden])
Date: 2012-03-28 11:21:39


Hi,

On 28.03.2012 at 16:55, Hameed Alzahrani wrote:

> I ran a hello program that returns the host name. When I run it using
> mpirun -np 8 hello
> all 8 answers come back from the same machine.
> When I run it using
> mpirun -np 8 --host host1,host2,host3 hello
> I get answers from all the machines, but not from all the processors. I have 8 processors (host1=4, host2=2, host3=2), yet I got 3 answers from host1, 3 from host2 and 2 from host3.

If you want to specify the number of slots per machine, you can put it in a hostfile (otherwise a plain round-robin assignment is used). I'm not aware of a way to give different values for each machine on the command line. The hostfile would look like this:

host1 slots=4
host2 slots=2
host3 slots=2
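
To illustrate why the two invocations place ranks differently, here is a minimal Python sketch of the two placement policies. It is a simplified model and not Open MPI's actual mapper; the function names `round_robin` and `fill_by_slot` are made up for this example. It assumes that hosts given without slot counts are cycled through one rank at a time, and that hosts with slot counts are filled in the order listed.

```python
from collections import Counter

def round_robin(hosts, nranks):
    """Hypothetical model: no slot counts known, so hand out one rank
    per host in turn and wrap around until all ranks are placed."""
    return Counter(hosts[i % len(hosts)] for i in range(nranks))

def fill_by_slot(slots, nranks):
    """Hypothetical model: fill each host up to its slot count,
    in the order the hosts are listed in the hostfile."""
    placed = Counter()
    remaining = nranks
    for host, nslots in slots.items():
        take = min(nslots, remaining)
        placed[host] = take
        remaining -= take
    if remaining:
        raise ValueError("more ranks requested than slots available")
    return placed

hosts = ["host1", "host2", "host3"]
slots = {"host1": 4, "host2": 2, "host3": 2}

# --host host1,host2,host3 with 8 ranks
print(dict(round_robin(hosts, 8)))   # {'host1': 3, 'host2': 3, 'host3': 2}
# hostfile with slots=4, 2, 2 and 8 ranks
print(dict(fill_by_slot(slots, 8)))  # {'host1': 4, 'host2': 2, 'host3': 2}
```

The first result matches the 3/3/2 split reported above. With the hostfile saved under some name (say `myhosts`, a placeholder), it would be passed as `mpirun -np 8 --hostfile myhosts hello`.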

-- Reuti

>
> Regards,
>
> > From: reuti_at_[hidden]
> > Date: Wed, 28 Mar 2012 16:42:21 +0200
> > To: users_at_[hidden]
> > Subject: Re: [OMPI users] Can not run a parallel job on all the nodes in the cluster
> >
> > Hi,
> >
> > On 28.03.2012 at 16:30, Hameed Alzahrani wrote:
> >
> > > Hi,
> > >
> > > I mean the node from which I run the mpirun command. I use Condor as a scheduler, but I need to benchmark the cluster either through Condor or directly with Open MPI.
> >
> > I can't say anything regarding the Condor integration of Open MPI, but starting it directly with mpirun and supplying a valid number of ranks and a hostfile should start some processes on the other machines as requested. Can you run a plain mpihello first and output the rank and hostname? Do you have ssh access to all the machines in question? Do you have a shared home directory containing the applications?
> >
> > -- Reuti
> >
> >
> > > When I ran mpirun from one machine and checked the memory status of my three machines, memory usage appeared to increase only on that machine.
> > >
> > > Regards,
> > >
> > > > From: reuti_at_[hidden]
> > > > Date: Wed, 28 Mar 2012 15:12:17 +0200
> > > > To: users_at_[hidden]
> > > > Subject: Re: [OMPI users] Can not run a parallel job on all the nodes in the cluster
> > > >
> > > > Hi,
> > > >
> > > > On 27.03.2012 at 23:46, Hameed Alzahrani wrote:
> > > >
> > > > > When I run any parallel job, I get the answer only from the submitting node
> > > >
> > > > What do you mean by submitting node? Do you use a queuing system, and if so, which one?
> > > >
> > > > -- Reuti
> > > >
> > > >
> > > > > even when I tried to benchmark the cluster using LINPACK, it looks like the job is running only on the submitting node. Is there a way to make Open MPI distribute the job evenly across all the nodes according to the number of processors on each node? Even if I specify that the job should use 8 processors, it looks like Open MPI uses the submitting node's 4 processors instead of using the other processors. I also tried --host, but it does not work correctly for benchmarking the cluster. Does anyone use Open MPI to benchmark a cluster, or does anyone know how to make Open MPI divide a parallel job evenly across every processor in the cluster?
> > > > >
> > > > > Regards,
> > > > > _______________________________________________
> > > > > users mailing list
> > > > > users_at_[hidden]
> > > > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > > >
> > > >