Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8
From: tmishima_at_[hidden]
Date: 2013-03-20 19:39:23


Hi Ralph,

Here is the result of a rerun with --display-allocation.
I set OMP_NUM_THREADS=1 to make the problem clear.

Regards,
Tetsuya Mishima

P.S. As far as I checked, these two cases are OK (no problem):
(1) mpirun -v -np $NPROCS -x OMP_NUM_THREADS --display-allocation ~/Ducom/testbed/mPre m02-ld
(2) mpirun -v -x OMP_NUM_THREADS --display-allocation ~/Ducom/testbed/mPre m02-ld

Script File:

#!/bin/sh
#PBS -A tmishima
#PBS -N Ducom-run
#PBS -j oe
#PBS -l nodes=2:ppn=4
export OMP_NUM_THREADS=1
cd $PBS_O_WORKDIR
cp $PBS_NODEFILE pbs_hosts
NPROCS=`wc -l < pbs_hosts`
mpirun -v -np $NPROCS -hostfile pbs_hosts -x OMP_NUM_THREADS \
  --display-allocation ~/Ducom/testbed/mPre m02-ld
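
For reference, with "-l nodes=2:ppn=4" the copied pbs_hosts file contains one
line per allocated slot. Under the allocation shown below it would typically
look like this (the actual host names depend on which nodes Torque assigns):

node06
node06
node06
node06
node05
node05
node05
node05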

Output:
--------------------------------------------------------------------------
A deprecated MCA parameter value was specified in an MCA parameter
file. Deprecated MCA parameters should be avoided; they may disappear
in future releases.

  Deprecated parameter: orte_rsh_agent
--------------------------------------------------------------------------

====================== ALLOCATED NODES ======================

 Data for node: node06 Num slots: 4 Max slots: 0
 Data for node: node05 Num slots: 4 Max slots: 0

=================================================================
--------------------------------------------------------------------------
A hostfile was provided that contains at least one node not
present in the allocation:

  hostfile: pbs_hosts
  node: node06

If you are operating in a resource-managed environment, then only
nodes that are in the allocation can be used in the hostfile. You
may find relative node syntax to be a useful alternative to
specifying absolute node names - see the orte_hosts man page for
further information.
--------------------------------------------------------------------------
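
For reference, the relative node syntax mentioned in the message lets a
hostfile name nodes by their position in the allocation rather than by
hostname. A minimal sketch (assuming the +n# form described in the
orte_hosts man page; not verified here against 1.7rc8):

# pbs_hosts written with relative node names:
# +n0 and +n1 are the first and second nodes of the Torque allocation
+n0 slots=4
+n1 slots=4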

> I've submitted a patch to fix the Torque launch issue - just some
> leftover garbage that existed at the time of the 1.7.0 branch and didn't
> get removed.
>
> For the hostfile issue, I'm stumped as I can't see how the problem would
> come about. Could you please rerun your original test and add
> "--display-allocation" to your cmd line? Let's see if it is
> correctly finding the original allocation.
>
> Thanks
> Ralph
>
> On Mar 19, 2013, at 5:08 PM, tmishima_at_[hidden] wrote:
>
> >
> >
> > Hi Gus,
> >
> > Thank you for your comments. I understand your advice.
> > Our script used to be --npernode type as well.
> >
> > As I told before, our cluster consists of nodes having 4, 8,
> > and 32 cores, although it used to be homogeneous at the
> > starting time. Furthermore, since the performance of each core
> > is almost the same, a mixed use of nodes with different numbers
> > of cores is possible, just like #PBS -l nodes=1:ppn=32+4:ppn=8.
> >
> > --npernode type is not applicable to such a mixed use.
> > That's why I'd like to continue to use modified hostfile.
> >
> > By the way, the problem I reported to Jeff yesterday
> > was that something is wrong with openmpi-1.7 under Torque,
> > because it caused an error even in such a simple case as
> > shown below, which surprised me. Now, the problem is not
> > limited to the modified hostfile, I guess.
> >
> > #PBS -l nodes=4:ppn=8
> > mpirun -np 8 ./my_program
> > (OMP_NUM_THREADS=4)
> >
> > Regards,
> > Tetsuya Mishima
> >
> >> Hi Tetsuya
> >>
> >> Your script that edits $PBS_NODEFILE into a separate hostfile
> >> is very similar to some that I used here for
> >> hybrid OpenMP+MPI programs on older versions of OMPI.
> >> I haven't tried this in 1.6.X,
> >> but it looks like you did and it works also.
> >> I haven't tried 1.7 either.
> >> Since we run production machines,
> >> I try to stick to the stable versions of OMPI (even numbered:
> >> 1.6.X, 1.4.X, 1.2.X).
> >>
> >> I believe you can get the same effect even if you
> >> don't edit your $PBS_NODEFILE and let OMPI use it as is.
> >> Say, if you carefully choose the values in your
> >> #PBS -l nodes=?:ppn=?
> >> and your
> >> $OMP_NUM_THREADS
> >> and use mpiexec with --npernode or --cpus-per-proc.
> >>
> >> For instance, for twelve MPI processes, with two threads each,
> >> on nodes with eight cores each, I would try
> >> (but I haven't tried!):
> >>
> >> #PBS -l nodes=3:ppn=8
> >>
> >> export OMP_NUM_THREADS=2
> >>
> >> mpiexec -np 12 -npernode 4
> >>
> >> or perhaps more tightly:
> >>
> >> mpiexec -np 12 --report-bindings --bind-to-core --cpus-per-proc 2
> >>
> >> I hope this helps,
> >> Gus Correa
> >>
> >>
> >>
> >> On 03/19/2013 03:12 PM, tmishima_at_[hidden] wrote:
> >>>
> >>>
> >>> Hi Reuti and Gus,
> >>>
> >>> Thank you for your comments.
> >>>
> >>> Our cluster is a little bit heterogeneous; it has nodes with 4, 8, and
> >>> 32 cores.
> >>> I used 8-core nodes for "-l nodes=4:ppn=8" and 4-core nodes for
> >>> "-l nodes=2:ppn=4".
> >>> (strictly speaking, Torque picked up proper nodes.)
> >>>
> >>> As I mentioned before, I usually use openmpi-1.6.x, which has no
> >>> trouble with that kind of use. I encountered the issue when I was
> >>> evaluating openmpi-1.7 to check when we could move on to it, although
> >>> we have no positive reason to do that at this moment.
> >>>
> >>> As Gus pointed out, I use a script file as shown below for a practical
> >>> use of openmpi-1.6.x.
> >>>
> >>> #PBS -l nodes=2:ppn=32   # even "-l nodes=1:ppn=32+4:ppn=8" works fine
> >>> export OMP_NUM_THREADS=4
> >>> modify $PBS_NODEFILE pbs_hosts   # 64 lines are condensed to 16 lines here
> >>> mpirun -hostfile pbs_hosts -np 16 -cpus-per-proc 4 -report-bindings \
> >>>   -x OMP_NUM_THREADS ./my_program
> >>> # 32-core node has 8 numanodes, 8-core node has 2 numanodes
> >>>
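The "modify" step in the script above is a site-specific helper. As a rough
sketch of one way to do the same condensation (an assumption, not the script
actually used in the thread):

# keep every OMP_NUM_THREADS-th line of the Torque nodefile, so the 64 slot
# lines from "-l nodes=2:ppn=32" become the 16 rank lines needed for -np 16
awk -v n="$OMP_NUM_THREADS" 'NR % n == 1' "$PBS_NODEFILE" > pbs_hosts
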
> >>> It works well under the combination of openmpi-1.6.x and Torque. The
> >>> problem is just
> >>> openmpi-1.7's behavior.
> >>>
> >>> Regards,
> >>> Tetsuya Mishima
> >>>
> >>>> Hi Tetsuya Mishima
> >>>>
> >>>> Mpiexec offers you a number of possibilities that you could try:
> >>>> --bynode,
> >>>> --pernode,
> >>>> --npernode,
> >>>> --bysocket,
> >>>> --bycore,
> >>>> --cpus-per-proc,
> >>>> --cpus-per-rank,
> >>>> --rankfile
> >>>> and more.
> >>>>
> >>>> Most likely one or more of them will fit your needs.
> >>>>
> >>>> There are also associated flags to bind processes to cores,
> >>>> to sockets, etc, to report the bindings, and so on.
> >>>>
> >>>> Check the mpiexec man page for details.
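
As an illustration of the options listed above (a sketch only, reusing the
thread's ./my_program; option spellings follow the 1.6-series mpiexec man
page):

# four processes per node, each bound to a core, with bindings reported
mpiexec -np 8 -npernode 4 -bind-to-core -report-bindings ./my_program

# two cores per process, e.g. for two OpenMP threads per MPI rank
mpiexec -np 4 -cpus-per-proc 2 -bind-to-core -report-bindings ./my_program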
> >>>>
> >>>> Nevertheless, I am surprised that modifying the
> >>>> $PBS_NODEFILE doesn't work for you in OMPI 1.7.
> >>>> I have done this many times in older versions of OMPI.
> >>>>
> >>>> Would it work for you to go back to the stable OMPI 1.6.X,
> >>>> or does it lack any special feature that you need?
> >>>>
> >>>> I hope this helps,
> >>>> Gus Correa
> >>>>
> >>>> On 03/19/2013 03:00 AM, tmishima_at_[hidden] wrote:
> >>>>>
> >>>>>
> >>>>> Hi Jeff,
> >>>>>
> >>>>> I didn't have much time to test this morning. So, I checked it again
> >>>>> now. Then, the trouble seems to depend on the number of nodes to use.
> >>>>>
> >>>>> This works (nodes < 4):
> >>>>> mpiexec -bynode -np 4 ./my_program && #PBS -l nodes=2:ppn=8
> >>>>> (OMP_NUM_THREADS=4)
> >>>>>
> >>>>> This causes an error (nodes >= 4):
> >>>>> mpiexec -bynode -np 8 ./my_program && #PBS -l nodes=4:ppn=8
> >>>>> (OMP_NUM_THREADS=4)
> >>>>>
> >>>>> Regards,
> >>>>> Tetsuya Mishima
> >>>>>
> >>>>>> Oy; that's weird.
> >>>>>>
> >>>>>> I'm afraid we're going to have to wait for Ralph to answer why that
> >>>>>> is happening -- sorry!
> >>>>>>
> >>>>>>
> >>>>>> On Mar 18, 2013, at 4:45 PM, <tmishima_at_[hidden]> wrote:
> >>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Hi Correa and Jeff,
> >>>>>>>
> >>>>>>> Thank you for your comments. I quickly checked your suggestion.
> >>>>>>>
> >>>>>>> As a result, my simple example case worked well.
> >>>>>>> export OMP_NUM_THREADS=4
> >>>>>>> mpiexec -bynode -np 2 ./my_program && #PBS -l nodes=2:ppn=4
> >>>>>>>
> >>>>>>> But, a practical case where more than one process was allocated to
> >>>>>>> a node, like below, did not work.
> >>>>>>> export OMP_NUM_THREADS=4
> >>>>>>> mpiexec -bynode -np 4 ./my_program && #PBS -l nodes=2:ppn=8
> >>>>>>>
> >>>>>>> The error message is as follows:
> >>>>>>> [node08.cluster:11946] [[30666,0],3] ORTE_ERROR_LOG: A message is
> >>>>>>> attempting to be sent to a process whose contact information is
> >>>>>>> unknown in file rml_oob_send.c at line 316
> >>>>>>> [node08.cluster:11946] [[30666,0],3] unable to find address for
> >>>>>>> [[30666,0],1]
> >>>>>>> [node08.cluster:11946] [[30666,0],3] ORTE_ERROR_LOG: A message is
> >>>>>>> attempting to be sent to a process whose contact information is
> >>>>>>> unknown in file base/grpcomm_base_rollup.c at line 123
> >>>>>>>
> >>>>>>> Here is our openmpi configuration:
> >>>>>>> ./configure \
> >>>>>>> --prefix=/home/mishima/opt/mpi/openmpi-1.7rc8-pgi12.9 \
> >>>>>>> --with-tm \
> >>>>>>> --with-verbs \
> >>>>>>> --disable-ipv6 \
> >>>>>>> CC=pgcc CFLAGS="-fast -tp k8-64e" \
> >>>>>>> CXX=pgCC CXXFLAGS="-fast -tp k8-64e" \
> >>>>>>> F77=pgfortran FFLAGS="-fast -tp k8-64e" \
> >>>>>>> FC=pgfortran FCFLAGS="-fast -tp k8-64e"
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Tetsuya Mishima
> >>>>>>>
> >>>>>>>> On Mar 17, 2013, at 10:55 PM, Gustavo Correa <gus_at_[hidden]> wrote:
> >>>>>>>>
> >>>>>>>>> In your example, have you tried not to modify the node file,
> >>>>>>>>> launch two mpi processes with mpiexec, and request a "-bynode"
> >>>>>>>>> distribution of processes:
> >>>>>>>>>
> >>>>>>>>> mpiexec -bynode -np 2 ./my_program
> >>>>>>>>
> >>>>>>>> This should work in 1.7, too (I use these kinds of options with
> >>>>>>>> SLURM all the time).
> >>>>>>>>
> >>>>>>>> However, we should probably verify that the hostfile functionality
> >>>>>>>> in batch jobs hasn't been broken in 1.7, too, because I'm pretty
> >>>>>>>> sure that what you described should work. However, Ralph, our
> >>>>>>>> run-time guy, is on vacation this week. There might be a delay in
> >>>>>>>> checking into this.
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Jeff Squyres
> >>>>>>>> jsquyres_at_[hidden]
> >>>>>>>> For corporate legal information go to:
> >>>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
> >>>>>>>>
> >>>>>> --
> >>>>>> Jeff Squyres
> >>>>>> jsquyres_at_[hidden]
> >>>>>> For corporate legal information go to:
> >>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/