
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8
From: tmishima_at_[hidden]
Date: 2013-03-19 20:08:43


Hi Gus,

Thank you for your comments; I understand your advice.
Our script also used to be of the --npernode type.

As I mentioned before, our cluster consists of nodes with 4, 8,
and 32 cores, although it was homogeneous when it started.
Furthermore, since the per-core performance is almost the same,
a mixed request over nodes with different core counts is possible,
such as #PBS -l nodes=1:ppn=32+4:ppn=8.

The --npernode style is not applicable to such a mixed use.
That's why I'd like to continue using a modified hostfile.
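For reference, the condensation step itself is simple; a minimal sketch of what my script does (the helper name and node names are only for illustration) is:

```shell
# condense_nodefile: turn a PBS-style nodefile (one line per core)
# into a hostfile with one line per MPI rank, where each rank owns
# `threads` cores. uniq -c counts each node's consecutive core
# lines; awk then prints the node name once per group of `threads`.
condense_nodefile() {
    nodefile=$1
    threads=$2
    uniq -c "$nodefile" | awk -v t="$threads" \
        '{ for (i = 0; i < int($1 / t); i++) print $2 }'
}
```

With OMP_NUM_THREADS=4 and a mixed request such as -l nodes=1:ppn=32+4:ppn=8, this yields 8 lines for the 32-core node and 2 lines for each 8-core node, e.g. condense_nodefile "$PBS_NODEFILE" "$OMP_NUM_THREADS" > pbs_hosts.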

By the way, the problem I reported to Jeff yesterday
was that something is wrong with openmpi-1.7 under Torque,
because it raised an error even in a case as simple as the
one shown below, which surprised me. So I now guess the
problem is not limited to the modified hostfile.

#PBS -l nodes=4:ppn=8
mpirun -np 8 ./my_program
(OMP_NUM_THREADS=4)
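Written out as a complete job script (the cd to $PBS_O_WORKDIR is only an assumption about our usual setup), the failing case is nothing more than:

```shell
#!/bin/sh
#PBS -l nodes=4:ppn=8
# hybrid run: 8 MPI ranks, 4 OpenMP threads each, over 4 x 8 cores
export OMP_NUM_THREADS=4
cd "$PBS_O_WORKDIR"
mpirun -np 8 ./my_program
```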

Regards,
Tetsuya Mishima

> Hi Tetsuya
>
> Your script that edits $PBS_NODEFILE into a separate hostfile
> is very similar to some that I used here for
> hybrid OpenMP+MPI programs on older versions of OMPI.
> I haven't tried this in 1.6.X,
> but it looks like you did and it works also.
> I haven't tried 1.7 either.
> Since we run production machines,
> I try to stick to the stable versions of OMPI (even numbered:
> 1.6.X, 1.4.X, 1.2.X).
>
> I believe you can get the same effect even if you
> don't edit your $PBS_NODEFILE and let OMPI use it as is.
> Say, if you carefully choose the values in your
> #PBS -l nodes=?:ppn=?
> and your
> $OMP_NUM_THREADS
> and use mpiexec with --npernode or --cpus-per-proc.
>
> For instance, for twelve MPI processes, with two threads each,
> on nodes with eight cores each, I would try
> (but I haven't tried!):
>
> #PBS -l nodes=3:ppn=8
>
> export OMP_NUM_THREADS=2
>
> mpiexec -np 12 -npernode 4 ./my_program
>
> or perhaps more tightly:
>
> mpiexec -np 12 --report-bindings --bind-to-core --cpus-per-proc 2
>
> I hope this helps,
> Gus Correa
>
>
>
> On 03/19/2013 03:12 PM, tmishima_at_[hidden] wrote:
> >
> >
> > Hi Reuti and Gus,
> >
> > Thank you for your comments.
> >
> > Our cluster is a little bit heterogeneous; it has nodes with 4, 8,
> > and 32 cores.
> > I used 8-core nodes for "-l nodes=4:ppn=8" and 4-core nodes for "-l
> > nodes=2:ppn=4".
> > (strictly speaking, Torque picked up proper nodes.)
> >
> > As I mentioned before, I usually use openmpi-1.6.x, which has no
> > trouble with that kind of use. I encountered the issue when I was
> > evaluating openmpi-1.7 to check when we could move to it, although
> > we have no pressing reason to do so at this moment.
> >
> > As Gus pointed out, I use a script file as shown below for practical
> > use of openmpi-1.6.x.
> >
> > #PBS -l nodes=2:ppn=32 # even "-l nodes=1:ppn=32+4:ppn=8" works fine
> > export OMP_NUM_THREADS=4
> > modify $PBS_NODEFILE pbs_hosts # 64 lines are condensed to 16 lines here
> > mpirun -hostfile pbs_hosts -np 16 -cpus-per-proc 4 -report-bindings \
> >   -x OMP_NUM_THREADS ./my_program
> > # note: a 32-core node has 8 numanodes; an 8-core node has 2 numanodes
> >
> > It works well under the combination of openmpi-1.6.x and Torque. The
> > problem is just
> > openmpi-1.7's behavior.
> >
> > Regards,
> > Tetsuya Mishima
> >
> >> Hi Tetsuya Mishima
> >>
> >> Mpiexec offers you a number of possibilities that you could try:
> >> --bynode,
> >> --pernode,
> >> --npernode,
> >> --bysocket,
> >> --bycore,
> >> --cpus-per-proc,
> >> --cpus-per-rank,
> >> --rankfile
> >> and more.
> >>
> >> Most likely one or more of them will fit your needs.
> >>
> >> There are also associated flags to bind processes to cores,
> >> to sockets, etc, to report the bindings, and so on.
> >>
> >> Check the mpiexec man page for details.
> >>
> >> Nevertheless, I am surprised that modifying the
> >> $PBS_NODEFILE doesn't work for you in OMPI 1.7.
> >> I have done this many times in older versions of OMPI.
> >>
> >> Would it work for you to go back to the stable OMPI 1.6.X,
> >> or does it lack any special feature that you need?
> >>
> >> I hope this helps,
> >> Gus Correa
> >>
> >> On 03/19/2013 03:00 AM, tmishima_at_[hidden] wrote:
> >>>
> >>>
> >>> Hi Jeff,
> >>>
> >>> I didn't have much time to test this morning. So, I checked it again
> >>> now. Then, the trouble seems to depend on the number of nodes to use.
> >>>
> >>> This works (nodes < 4):
> >>> mpiexec -bynode -np 4 ./my_program    # with #PBS -l nodes=2:ppn=8
> >>> (OMP_NUM_THREADS=4)
> >>>
> >>> This causes an error (nodes >= 4):
> >>> mpiexec -bynode -np 8 ./my_program    # with #PBS -l nodes=4:ppn=8
> >>> (OMP_NUM_THREADS=4)
> >>>
> >>> Regards,
> >>> Tetsuya Mishima
> >>>
> >>>> Oy; that's weird.
> >>>>
> >>>> I'm afraid we're going to have to wait for Ralph to answer why
> >>>> that is happening -- sorry!
> >>>>
> >>>>
> >>>> On Mar 18, 2013, at 4:45 PM, <tmishima_at_[hidden]> wrote:
> >>>>
> >>>>>
> >>>>>
> >>>>> Hi Correa and Jeff,
> >>>>>
> >>>>> Thank you for your comments. I quickly checked your suggestion.
> >>>>>
> >>>>> As a result, my simple example case worked well.
> >>>>> export OMP_NUM_THREADS=4
> >>>>> mpiexec -bynode -np 2 ./my_program    # with #PBS -l nodes=2:ppn=4
> >>>>>
> >>>>> But a practical case, where more than one process was allocated
> >>>>> to a node as below, did not work.
> >>>>> export OMP_NUM_THREADS=4
> >>>>> mpiexec -bynode -np 4 ./my_program    # with #PBS -l nodes=2:ppn=8
> >>>>>
> >>>>> The error message is as follows:
> >>>>> [node08.cluster:11946] [[30666,0],3] ORTE_ERROR_LOG: A message is
> >>>>> attempting to be sent to a process whose contact information is
> >>>>> unknown in file rml_oob_send.c at line 316
> >>>>> [node08.cluster:11946] [[30666,0],3] unable to find address for
> >>>>> [[30666,0],1]
> >>>>> [node08.cluster:11946] [[30666,0],3] ORTE_ERROR_LOG: A message is
> >>>>> attempting to be sent to a process whose contact information is
> >>>>> unknown in file base/grpcomm_base_rollup.c at line 123
> >>>>>
> >>>>> Here is our openmpi configuration:
> >>>>> ./configure \
> >>>>> --prefix=/home/mishima/opt/mpi/openmpi-1.7rc8-pgi12.9 \
> >>>>> --with-tm \
> >>>>> --with-verbs \
> >>>>> --disable-ipv6 \
> >>>>> CC=pgcc CFLAGS="-fast -tp k8-64e" \
> >>>>> CXX=pgCC CXXFLAGS="-fast -tp k8-64e" \
> >>>>> F77=pgfortran FFLAGS="-fast -tp k8-64e" \
> >>>>> FC=pgfortran FCFLAGS="-fast -tp k8-64e"
> >>>>>
> >>>>> Regards,
> >>>>> Tetsuya Mishima
> >>>>>
> >>>>>> On Mar 17, 2013, at 10:55 PM, Gustavo Correa <gus_at_[hidden]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> In your example, have you tried not to modify the node file,
> >>>>>>> launch two mpi processes with mpiexec, and request a "-bynode"
> >>>>> distribution of processes:
> >>>>>>>
> >>>>>>> mpiexec -bynode -np 2 ./my_program
> >>>>>>
> >>>>>> This should work in 1.7, too (I use these kinds of options with
> >>>>>> SLURM all the time).
> >>>>>>
> >>>>>> However, we should probably verify that the hostfile functionality
> >>>>>> in batch jobs hasn't been broken in 1.7, too, because I'm pretty
> >>>>>> sure that what you described should work. However, Ralph, our
> >>>>>> run-time guy, is on vacation this week. There might be a delay in
> >>>>>> checking into this.
> >>>>>>
> >>>>>> --
> >>>>>> Jeff Squyres
> >>>>>> jsquyres_at_[hidden]
> >>>>>> For corporate legal information go to:
> >>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
> >>>>>>
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> users mailing list
> >>>>>> users_at_[hidden]
> >>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>