Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-03-20 21:38:01


Could you please apply the attached patch and try it again? If you haven't had time to configure with --enable-debug, that is fine - the patch will produce output regardless.

Thanks
Ralph


On Mar 20, 2013, at 4:59 PM, Ralph Castain <rhc_at_[hidden]> wrote:

> You obviously have some MCA params set somewhere:
>
>> --------------------------------------------------------------------------
>> A deprecated MCA parameter value was specified in an MCA parameter
>> file. Deprecated MCA parameters should be avoided; they may disappear
>> in future releases.
>>
>> Deprecated parameter: orte_rsh_agent
>> --------------------------------------------------------------------------
>
> Check your environment for anything with OMPI_MCA_xxx, and your default MCA parameter file to see what has been specified.
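One quick way to act on that suggestion - a minimal sketch, assuming a default installation layout, so the mca-params.conf locations may differ on your system:

  # MCA parameters set in the environment
  env | grep '^OMPI_MCA_'
  # MCA parameters set in the per-user and system-wide default files
  grep -H rsh_agent ~/.openmpi/mca-params.conf 2>/dev/null
  grep -H rsh_agent "$(dirname "$(dirname "$(which mpirun)")")/etc/openmpi-mca-params.conf" 2>/dev/null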
>
> The allocation looks okay - I'll have to look for other debug flags you can set. Meantime, can you please add --enable-debug to your configure cmd line and rebuild?
>
> Thanks
> Ralph
>
>
> On Mar 20, 2013, at 4:39 PM, tmishima_at_[hidden] wrote:
>
>>
>>
>> Hi Ralph,
>>
>> Here is the result of a rerun with --display-allocation.
>> I set OMP_NUM_THREADS=1 to make the problem clear.
>>
>> Regards,
>> Tetsuya Mishima
>>
>> P.S. As far as I checked, these 2 cases are OK (no problem):
>> (1) mpirun -v -np $NPROCS -x OMP_NUM_THREADS --display-allocation ~/Ducom/testbed/mPre m02-ld
>> (2) mpirun -v -x OMP_NUM_THREADS --display-allocation ~/Ducom/testbed/mPre m02-ld
>>
>> Script File:
>>
>> #!/bin/sh
>> #PBS -A tmishima
>> #PBS -N Ducom-run
>> #PBS -j oe
>> #PBS -l nodes=2:ppn=4
>> export OMP_NUM_THREADS=1
>> cd $PBS_O_WORKDIR
>> cp $PBS_NODEFILE pbs_hosts
>> NPROCS=`wc -l < pbs_hosts`
>> mpirun -v -np $NPROCS -hostfile pbs_hosts -x OMP_NUM_THREADS \
>>   --display-allocation ~/Ducom/testbed/mPre m02-ld
>>
>> Output:
>> --------------------------------------------------------------------------
>> A deprecated MCA parameter value was specified in an MCA parameter
>> file. Deprecated MCA parameters should be avoided; they may disappear
>> in future releases.
>>
>> Deprecated parameter: orte_rsh_agent
>> --------------------------------------------------------------------------
>>
>> ====================== ALLOCATED NODES ======================
>>
>> Data for node: node06 Num slots: 4 Max slots: 0
>> Data for node: node05 Num slots: 4 Max slots: 0
>>
>> =================================================================
>> --------------------------------------------------------------------------
>> A hostfile was provided that contains at least one node not
>> present in the allocation:
>>
>> hostfile: pbs_hosts
>> node: node06
>>
>> If you are operating in a resource-managed environment, then only
>> nodes that are in the allocation can be used in the hostfile. You
>> may find relative node syntax to be a useful alternative to
>> specifying absolute node names - see the orte_hosts man page for
>> further information.
>> --------------------------------------------------------------------------
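For reference, the relative node syntax mentioned in that message lets a hostfile refer to nodes by their position in the allocation instead of by hostname, which avoids exactly this kind of name mismatch. A minimal sketch for the two-node allocation above (hypothetical hostfile contents; see the orte_hosts man page for the exact rules and for how to attach slot counts):

  # pbs_hosts written with relative node syntax:
  # +n0 is the first node of the allocation, +n1 the second
  +n0
  +n1

  mpirun -np $NPROCS -hostfile pbs_hosts -x OMP_NUM_THREADS \
      --display-allocation ~/Ducom/testbed/mPre m02-ld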
>>
>>
>>> I've submitted a patch to fix the Torque launch issue - just some
>>> leftover garbage that existed at the time of the 1.7.0 branch and
>>> didn't get removed.
>>>
>>> For the hostfile issue, I'm stumped as I can't see how the problem
>>> would come about. Could you please rerun your original test and add
>>> "--display-allocation" to your cmd line? Let's see if it is correctly
>>> finding the original allocation.
>>>
>>> Thanks
>>> Ralph
>>>
>>> On Mar 19, 2013, at 5:08 PM, tmishima_at_[hidden] wrote:
>>>
>>>>
>>>>
>>>> Hi Gus,
>>>>
>>>> Thank you for your comments. I understand your advice.
>>>> Our script used to use the --npernode style as well.
>>>>
>>>> As I mentioned before, our cluster consists of nodes having 4, 8,
>>>> and 32 cores, although it was homogeneous at the start. Furthermore,
>>>> since the performance of each core is almost the same, a mixed use
>>>> of nodes with different numbers of cores is possible, just like
>>>> #PBS -l nodes=1:ppn=32+4:ppn=8.
>>>>
>>>> The --npernode style is not applicable to such a mixed use.
>>>> That's why I'd like to keep using a modified hostfile.
>>>>
>>>> By the way, the problem I reported to Jeff yesterday was that
>>>> something is wrong with openmpi-1.7 under Torque, because it caused
>>>> an error even for a case as simple as the one shown below, which
>>>> surprised me. So the problem is not limited to the modified
>>>> hostfile, I guess.
>>>>
>>>> #PBS -l nodes=4:ppn=8
>>>> mpirun -np 8 ./my_program
>>>> (OMP_NUM_THREADS=4)
>>>>
>>>> Regards,
>>>> Tetsuya Mishima
>>>>
>>>>> Hi Tetsuya
>>>>>
>>>>> Your script that edits $PBS_NODEFILE into a separate hostfile
>>>>> is very similar to some that I used here for
>>>>> hybrid OpenMP+MPI programs on older versions of OMPI.
>>>>> I haven't tried this in 1.6.X,
>>>>> but it looks like you did and it works also.
>>>>> I haven't tried 1.7 either.
>>>>> Since we run production machines,
>>>>> I try to stick to the stable versions of OMPI (even numbered:
>>>>> 1.6.X, 1.4.X, 1.2.X).
>>>>>
>>>>> I believe you can get the same effect even if you
>>>>> don't edit your $PBS_NODEFILE and let OMPI use it as is.
>>>>> Say, if you carefully choose the values of your
>>>>> #PBS -l nodes=?:ppn=?
>>>>> and of your
>>>>> $OMP_NUM_THREADS
>>>>> and use mpiexec with --npernode or --cpus-per-proc.
>>>>>
>>>>> For instance, for twelve MPI processes, with two threads each,
>>>>> on nodes with eight cores each, I would try
>>>>> (but I haven't tried!):
>>>>>
>>>>> #PBS -l nodes=3:ppn=8
>>>>>
>>>>> export OMP_NUM_THREADS=2
>>>>>
>>>>> mpiexec -np 12 -npernode 4 ./my_program
>>>>>
>>>>> or perhaps more tightly:
>>>>>
>>>>> mpiexec -np 12 --report-bindings --bind-to-core --cpus-per-proc 2 ./my_program
>>>>>
>>>>> I hope this helps,
>>>>> Gus Correa
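Put together as a complete Torque script, Gus's second variant might look roughly like the sketch below (untested; ./my_program is a placeholder and the 1.6-style binding options are taken from his example):

  #!/bin/sh
  #PBS -l nodes=3:ppn=8
  cd $PBS_O_WORKDIR
  # two OpenMP threads per MPI rank
  export OMP_NUM_THREADS=2
  # 12 ranks total, each bound to 2 cores, letting Open MPI take the
  # node list straight from the Torque allocation
  mpiexec -np 12 --report-bindings --bind-to-core --cpus-per-proc 2 \
      -x OMP_NUM_THREADS ./my_program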
>>>>>
>>>>>
>>>>>
>>>>> On 03/19/2013 03:12 PM, tmishima_at_[hidden] wrote:
>>>>>>
>>>>>>
>>>>>> Hi Reuti and Gus,
>>>>>>
>>>>>> Thank you for your comments.
>>>>>>
>>>>>> Our cluster is a little bit heterogeneous; it has nodes with 4, 8,
>>>>>> and 32 cores.
>>>>>> I used 8-core nodes for "-l nodes=4:ppn=8" and 4-core nodes for
>>>>>> "-l nodes=2:ppn=4".
>>>>>> (Strictly speaking, Torque picked up the proper nodes.)
>>>>>>
>>>>>> As I mentioned before, I usually use openmpi-1.6.x, which has no
>>>>>> trouble with that kind of use. I encountered the issue when I was
>>>>>> evaluating openmpi-1.7 to check when we could move on to it,
>>>>>> although we have no positive reason to do that at this moment.
>>>>>>
>>>>>> As Gus pointed out, I use a script file like the one shown below
>>>>>> for practical use of openmpi-1.6.x.
>>>>>>
>>>>>> #PBS -l nodes=2:ppn=32   # even "-l nodes=1:ppn=32+4:ppn=8" works fine
>>>>>> export OMP_NUM_THREADS=4
>>>>>> modify $PBS_NODEFILE pbs_hosts   # 64 lines are condensed to 16 lines here
>>>>>> mpirun -hostfile pbs_hosts -np 16 -cpus-per-proc 4 -report-bindings \
>>>>>>   -x OMP_NUM_THREADS ./my_program
>>>>>> # a 32-core node has 8 numanodes, an 8-core node has 2 numanodes
>>>>>>
>>>>>> It works well under the combination of openmpi-1.6.x and Torque. The
>>>>>> problem is just
>>>>>> openmpi-1.7's behavior.
>>>>>>
>>>>>> Regards,
>>>>>> Tetsuya Mishima
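The "modify $PBS_NODEFILE pbs_hosts" line above stands in for a site-specific step; one simple way to do that condensation is sketched below (it assumes Torque writes one line per slot and that each node's ppn is a multiple of OMP_NUM_THREADS):

  export OMP_NUM_THREADS=4
  # keep one host line per group of OMP_NUM_THREADS slot lines
  # (64 lines -> 16 lines for the nodes=2:ppn=32 allocation)
  awk -v n=$OMP_NUM_THREADS 'NR % n == 1' $PBS_NODEFILE > pbs_hosts
  mpirun -hostfile pbs_hosts -np `wc -l < pbs_hosts` -cpus-per-proc 4 \
      -report-bindings -x OMP_NUM_THREADS ./my_program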
>>>>>>
>>>>>>> Hi Tetsuya Mishima
>>>>>>>
>>>>>>> Mpiexec offers you a number of possibilities that you could try:
>>>>>>> --bynode,
>>>>>>> --pernode,
>>>>>>> --npernode,
>>>>>>> --bysocket,
>>>>>>> --bycore,
>>>>>>> --cpus-per-proc,
>>>>>>> --cpus-per-rank,
>>>>>>> --rankfile
>>>>>>> and more.
>>>>>>>
>>>>>>> Most likely one or more of them will fit your needs.
>>>>>>>
>>>>>>> There are also associated flags to bind processes to cores,
>>>>>>> to sockets, etc, to report the bindings, and so on.
>>>>>>>
>>>>>>> Check the mpiexec man page for details.
>>>>>>>
>>>>>>> Nevertheless, I am surprised that modifying the
>>>>>>> $PBS_NODEFILE doesn't work for you in OMPI 1.7.
>>>>>>> I have done this many times in older versions of OMPI.
>>>>>>>
>>>>>>> Would it work for you to go back to the stable OMPI 1.6.X,
>>>>>>> or does it lack any special feature that you need?
>>>>>>>
>>>>>>> I hope this helps,
>>>>>>> Gus Correa
>>>>>>>
>>>>>>> On 03/19/2013 03:00 AM, tmishima_at_[hidden] wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Jeff,
>>>>>>>>
>>>>>>>> I didn't have much time to test this morning, so I checked it
>>>>>>>> again just now. The trouble seems to depend on the number of
>>>>>>>> nodes used.
>>>>>>>>
>>>>>>>> This works (nodes < 4):
>>>>>>>> mpiexec -bynode -np 4 ./my_program && #PBS -l nodes=2:ppn=8
>>>>>>>> (OMP_NUM_THREADS=4)
>>>>>>>>
>>>>>>>> This causes an error (nodes >= 4):
>>>>>>>> mpiexec -bynode -np 8 ./my_program && #PBS -l nodes=4:ppn=8
>>>>>>>> (OMP_NUM_THREADS=4)
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Tetsuya Mishima
>>>>>>>>
>>>>>>>>> Oy; that's weird.
>>>>>>>>>
>>>>>>>>> I'm afraid we're going to have to wait for Ralph to answer why
>>>>>>>>> that is happening -- sorry!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mar 18, 2013, at 4:45 PM, <tmishima_at_[hidden]> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Correa and Jeff,
>>>>>>>>>>
>>>>>>>>>> Thank you for your comments. I quickly checked your suggestion.
>>>>>>>>>>
>>>>>>>>>> As a result, my simple example case worked well.
>>>>>>>>>> export OMP_NUM_THREADS=4
>>>>>>>>>> mpiexec -bynode -np 2 ./my_program && #PBS -l nodes=2:ppn=4
>>>>>>>>>>
>>>>>>>>>> But a practical case where more than one process is allocated
>>>>>>>>>> to a node, like the one below, did not work.
>>>>>>>>>> export OMP_NUM_THREADS=4
>>>>>>>>>> mpiexec -bynode -np 4 ./my_program && #PBS -l nodes=2:ppn=8
>>>>>>>>>>
>>>>>>>>>> The error message is as follows:
>>>>>>>>>> [node08.cluster:11946] [[30666,0],3] ORTE_ERROR_LOG: A message is
>>>>>>>>>> attempting to be sent to a process whose contact information is
>>>>>>>>>> unknown in file rml_oob_send.c at line 316
>>>>>>>>>> [node08.cluster:11946] [[30666,0],3] unable to find address for
>>>>>>>>>> [[30666,0],1]
>>>>>>>>>> [node08.cluster:11946] [[30666,0],3] ORTE_ERROR_LOG: A message is
>>>>>>>>>> attempting to be sent to a process whose contact information is
>>>>>>>>>> unknown in file base/grpcomm_base_rollup.c at line 123
>>>>>>>>>>
>>>>>>>>>> Here is our openmpi configuration:
>>>>>>>>>> ./configure \
>>>>>>>>>> --prefix=/home/mishima/opt/mpi/openmpi-1.7rc8-pgi12.9 \
>>>>>>>>>> --with-tm \
>>>>>>>>>> --with-verbs \
>>>>>>>>>> --disable-ipv6 \
>>>>>>>>>> CC=pgcc CFLAGS="-fast -tp k8-64e" \
>>>>>>>>>> CXX=pgCC CXXFLAGS="-fast -tp k8-64e" \
>>>>>>>>>> F77=pgfortran FFLAGS="-fast -tp k8-64e" \
>>>>>>>>>> FC=pgfortran FCFLAGS="-fast -tp k8-64e"
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Tetsuya Mishima
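As a side check on the --with-tm build, the installed copy can be asked whether the Torque components were actually compiled in; a quick, version-dependent sketch:

  # the Torque (tm) components should appear among the MCA frameworks,
  # e.g. in the plm and ras lists
  ompi_info | grep ": tm"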
>>>>>>>>>>
>>>>>>>>>>> On Mar 17, 2013, at 10:55 PM, Gustavo Correa <gus_at_[hidden]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> In your example, have you tried not modifying the node file,
>>>>>>>>>>>> launching two MPI processes with mpiexec, and requesting a
>>>>>>>>>>>> "-bynode" distribution of processes:
>>>>>>>>>>>>
>>>>>>>>>>>> mpiexec -bynode -np 2 ./my_program
>>>>>>>>>>>
>>>>>>>>>>> This should work in 1.7, too (I use these kinds of options with
>>>>>>>>>>> SLURM all the time).
>>>>>>>>>>>
>>>>>>>>>>> However, we should probably verify that the hostfile functionality
>>>>>>>>>>> in batch jobs hasn't been broken in 1.7, too, because I'm pretty
>>>>>>>>>>> sure that what you described should work. Unfortunately, Ralph, our
>>>>>>>>>>> run-time guy, is on vacation this week. There might be a delay in
>>>>>>>>>>> checking into this.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Jeff Squyres
>>>>>>>>>>> jsquyres_at_[hidden]
>>>>>>>>>>> For corporate legal information go to:
>>>>>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>


  • application/octet-stream attachment: user.diff