Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] one more finding in openmpi-1.7.5a1
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-02-14 10:21:57


<laugh> nothing that profound, I fear. Just the old man's brain continuing to itch over something while in that light sleep stage until the scratching gets enough that you realize the cause of the itch :-)

On Feb 14, 2014, at 2:59 AM, Reuti <reuti_at_[hidden]> wrote:

> Am 14.02.2014 um 11:23 schrieb tmishima_at_[hidden]:
>
>> You found it in a dream, interesting!
>
> Insights do sometimes come while dreaming:
>
> https://skeptics.stackexchange.com/questions/5317/was-the-periodic-table-discovered-in-a-dream-by-dmitri-mendeleyev
>
> -- Reuti
>
>
>> Tetsuya Mishima
>>
>>> Thanks - hit me in the middle of the night over here that we had missed
>>> some options, but nice to find you had also seen it. Slightly modified
>>> patch will be applied and brought over to 1.7.5
>>>
>>>
>>> On Feb 13, 2014, at 10:16 PM, tmishima_at_[hidden] wrote:
>>>
>>>>
>>>>
>>>>
>>>> Please try attached patch - from r30723.
>>>>
>>>> (See attached file: patch.rmaps_base_frame.from_r30723)
>>>>
>>>> Tetsuya Mishima
>>>>
>>>>> Thanks for the prompt help.
>>>>> Could you please resend the patch as an attachment that can be applied
>>>>> with the "patch" command? My mail client mangles long lines.
>>>>>
>>>>>
>>>>> On Fri, Feb 14, 2014 at 7:40 AM, <tmishima_at_[hidden]>wrote:
>>>>>
>>>>>
>>>>> Thanks. I'm not familiar with the mindist mapper. But obviously
>>>>> the check for ORTE_MAPPING_BYDIST is missing. In addition,
>>>>> ORTE_MAPPING_PPR is missing again because of my mistake.
>>>>>
>>>>> Please try this patch.
>>>>>
>>>>> #if OPAL_HAVE_HWLOC
>>>>>             } else if (ORTE_MAPPING_BYCORE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_CORE);
>>>>>             } else if (ORTE_MAPPING_BYL1CACHE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_L1CACHE);
>>>>>             } else if (ORTE_MAPPING_BYL2CACHE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_L2CACHE);
>>>>>             } else if (ORTE_MAPPING_BYL3CACHE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_L3CACHE);
>>>>>             } else if (ORTE_MAPPING_BYSOCKET == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_SOCKET);
>>>>>             } else if (ORTE_MAPPING_BYNUMA == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_NUMA);
>>>>>             } else if (ORTE_MAPPING_BYBOARD == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_BOARD);
>>>>>             } else if (ORTE_MAPPING_BYHWTHREAD == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_HWTHREAD);
>>>>>             } else if (ORTE_MAPPING_PPR == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_SLOT);
>>>>>             } else if (ORTE_MAPPING_BYDIST == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_SLOT);
>>>>> #endif
>>>>>
>>>>> Tetsuya Mishima
>>>>>
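
The patch above extends a long if/else chain, and each new mapping policy (PPR, BYDIST) had to be added by hand. One way to make such omissions easier to spot is a lookup table. The following is only a sketch under stated assumptions: the enum names, the table, and rank_for_mapping() are hypothetical stand-ins and are not the ORTE definitions or the fix that was committed.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; the real ORTE constants differ. */
typedef enum {
    MAP_BYSLOT, MAP_BYNODE, MAP_BYCORE, MAP_BYSOCKET,
    MAP_PPR, MAP_BYDIST, MAP_UNKNOWN
} map_policy_t;

typedef enum {
    RANK_BY_SLOT, RANK_BY_NODE, RANK_BY_CORE, RANK_BY_SOCKET
} rank_policy_t;

struct map_rank_pair {
    map_policy_t  map;
    rank_policy_t rank;
};

/* One row per mapping policy. As in the patch, PPR and BYDIST
 * have no ranking of their own and fall back to rank-by-slot. */
static const struct map_rank_pair map_rank_table[] = {
    { MAP_BYSLOT,   RANK_BY_SLOT   },
    { MAP_BYNODE,   RANK_BY_NODE   },
    { MAP_BYCORE,   RANK_BY_CORE   },
    { MAP_BYSOCKET, RANK_BY_SOCKET },
    { MAP_PPR,      RANK_BY_SLOT   },
    { MAP_BYDIST,   RANK_BY_SLOT   },
};

/* Look up the ranking policy that matches a mapping policy.
 * Returns 1 and stores the result in *out when a row exists;
 * returns 0 and leaves *out untouched otherwise. */
static int rank_for_mapping(map_policy_t map, rank_policy_t *out)
{
    size_t n = sizeof(map_rank_table) / sizeof(map_rank_table[0]);
    for (size_t i = 0; i < n; i++) {
        if (map_rank_table[i].map == map) {  /* equality, not '&' */
            *out = map_rank_table[i].rank;
            return 1;
        }
    }
    return 0;
}
```

A caller would treat a return value of 0 as "no matching ranking" and could report that case explicitly instead of falling through silently.
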
>>>>>> Hi,
>>>>>> after this patch we get this in jenkins:
>>>>>>
>>>>>> 07:03:15 [vegas12.mtr.labs.mlnx:01646] [[26922,0],0] ORTE_ERROR_LOG: Not implemented in file rmaps_mindist_module.c at line 391
>>>>>> 07:03:15 [vegas12.mtr.labs.mlnx:01646] [[26922,0],0] ORTE_ERROR_LOG: Not implemented in file base/rmaps_base_map_job.c at line 285
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Feb 14, 2014 at 6:35 AM, <tmishima_at_[hidden]>wrote:
>>>>>>
>>>>>>
>>>>>> Sorry, one more shot - byslot was dropped!
>>>>>>
>>>>>> if (NULL == spec) {
>>>>>>     /* check for map-by object directives - we set the
>>>>>>      * ranking to match if one was given
>>>>>>      */
>>>>>>     if (ORTE_MAPPING_GIVEN & ORTE_GET_MAPPING_DIRECTIVE(mapping)) {
>>>>>>         if (ORTE_MAPPING_BYSLOT == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>             ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_SLOT);
>>>>>>         } else if (ORTE_MAPPING_BYNODE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>             ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_NODE);
>>>>>> #if OPAL_HAVE_HWLOC
>>>>>>         } else if (ORTE_MAPPING_BYCORE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>             ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_CORE);
>>>>>>         } else if (ORTE_MAPPING_BYL1CACHE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>             ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_L1CACHE);
>>>>>>         } else if (ORTE_MAPPING_BYL2CACHE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>             ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_L2CACHE);
>>>>>>         } else if (ORTE_MAPPING_BYL3CACHE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>             ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_L3CACHE);
>>>>>>         } else if (ORTE_MAPPING_BYSOCKET == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>             ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_SOCKET);
>>>>>>         } else if (ORTE_MAPPING_BYNUMA == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>             ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_NUMA);
>>>>>>         } else if (ORTE_MAPPING_BYBOARD == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>             ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_BOARD);
>>>>>>         } else if (ORTE_MAPPING_BYHWTHREAD == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>             ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_HWTHREAD);
>>>>>> #endif
>>>>>>
>>>>>> Tetsuya Mishima
>>>>>>
>>>>>>> I've found it. Please add 2 lines(770, 771) in rmaps_base_frame.c:
>>>>>>>
>>>>>>> 747     if (NULL == spec) {
>>>>>>> 748         /* check for map-by object directives - we set the
>>>>>>> 749          * ranking to match if one was given
>>>>>>> 750          */
>>>>>>> 751         if (ORTE_MAPPING_GIVEN & ORTE_GET_MAPPING_DIRECTIVE(mapping)) {
>>>>>>> 752             if (ORTE_MAPPING_BYCORE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>> 753                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_CORE);
>>>>>>> 754             } else if (ORTE_MAPPING_BYNODE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>> 755                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_NODE);
>>>>>>> 756             } else if (ORTE_MAPPING_BYL1CACHE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>> 757                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_L1CACHE);
>>>>>>> 758             } else if (ORTE_MAPPING_BYL2CACHE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>> 759                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_L2CACHE);
>>>>>>> 760             } else if (ORTE_MAPPING_BYL3CACHE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>> 761                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_L3CACHE);
>>>>>>> 762             } else if (ORTE_MAPPING_BYSOCKET == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>> 763                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_SOCKET);
>>>>>>> 764             } else if (ORTE_MAPPING_BYNUMA == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>> 765                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_NUMA);
>>>>>>> 766             } else if (ORTE_MAPPING_BYBOARD == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>> 767                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_BOARD);
>>>>>>> 768             } else if (ORTE_MAPPING_BYHWTHREAD == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>> 769                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_HWTHREAD);
>>>>>>> 770             } else if (ORTE_MAPPING_PPR == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>> 771                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_SLOT);
>>>>>>> 772             }
>>>>>>>
>>>>>>> Tetsuya Mishima
>>>>>>>
>>>>>>>> You are welcome, Ralph.
>>>>>>>>
>>>>>>>> But after fixing it, I'm facing another problem when I use the ppr option:
>>>>>>>>
>>>>>>>> [mishima_at_manage openmpi-1.7.4]$ mpirun -np 2 -map-by ppr:1:socket -bind-to socket -report-bindings ~/mis/openmpi/demos/myprog
>>>>>>>> [manage.cluster:28057] [[25570,0],0] ORTE_ERROR_LOG: Not implemented in file rmaps_ppr.c at line 389
>>>>>>>> [manage.cluster:28057] [[25570,0],0] ORTE_ERROR_LOG: Not implemented in file base/rmaps_base_map_job.c at line 285
>>>>>>>>
>>>>>>>> I confirmed that it works when the change is reverted.
>>>>>>>> I'm a little bit confused. Could you take a look?
>>>>>>>>
>>>>>>>> Tetsuya Mishima
>>>>>>>>
>>>>>>>>> Thanks - these used to be bitmaps, but changed when we started getting so
>>>>>>>>> many options. Sadly, they are very rarely used, so bugs like this can go
>>>>>>>>> unnoticed for a long time. Appreciate you taking such a close look at them.
>>>>>>>>>
>>>>>>>>> Ralph
>>>>>>>>>
>>>>>>>>> On Feb 13, 2014, at 4:55 PM, tmishima_at_[hidden] wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Ralph,
>>>>>>>>>>
>>>>>>>>>> I would like to report one more finding in openmpi-1.7.5a1.
>>>>>>>>>>
>>>>>>>>>> Because the ORTE_MAPPING_BY... values are not bit fields,
>>>>>>>>>> you should not use "&" to compare them in
>>>>>>>>>> orte_rmaps_base_set_ranking_policy in rmaps_base_frame.c:
>>>>>>>>>>
>>>>>>>>>> 747     if (NULL == spec) {
>>>>>>>>>> 748         /* check for map-by object directives - we set the
>>>>>>>>>> 749          * ranking to match if one was given
>>>>>>>>>> 750          */
>>>>>>>>>> 751         if (ORTE_MAPPING_GIVEN & ORTE_GET_MAPPING_DIRECTIVE(mapping)) {
>>>>>>>>>> 752             if (ORTE_MAPPING_BYCORE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>>>>> 753                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_CORE);
>>>>>>>>>> 754             } else if (ORTE_MAPPING_BYNODE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>>>>> 755                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_NODE);
>>>>>>>>>> 756             } else if (ORTE_MAPPING_BYL1CACHE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>>>>> 757                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_L1CACHE);
>>>>>>>>>> 758             } else if (ORTE_MAPPING_BYL2CACHE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>>>>> 759                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_L2CACHE);
>>>>>>>>>> 760             } else if (ORTE_MAPPING_BYL3CACHE == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>>>>> 761                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_L3CACHE);
>>>>>>>>>> 762             } else if (ORTE_MAPPING_BYSOCKET == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>>>>> 763                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_SOCKET);
>>>>>>>>>> 764             } else if (ORTE_MAPPING_BYNUMA == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>>>>> 765                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_NUMA);
>>>>>>>>>> 766             } else if (ORTE_MAPPING_BYBOARD == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>>>>> 767                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_BOARD);
>>>>>>>>>> 768             } else if (ORTE_MAPPING_BYHWTHREAD == ORTE_GET_MAPPING_POLICY(mapping)) {
>>>>>>>>>> 769                 ORTE_SET_RANKING_POLICY(tmp, ORTE_RANK_BY_HWTHREAD);
>>>>>>>>>> 770             }
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Tetsuya Mishima
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> users mailing list
>>>>>>>>>> users_at_[hidden]
>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
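
The original report in this thread turns on the difference between flag bits, which are tested with "&", and enumerated policy values, which must be compared with "==". The sketch below illustrates that difference. The masks, constants, and helper names are hypothetical and chosen only for the example; they are not copied from the ORTE headers.

```c
#include <stdint.h>

/* Hypothetical layout for illustration; not the real ORTE definitions. */
#define DIRECTIVE_MASK  0xff00u  /* high byte: independent flag bits    */
#define POLICY_MASK     0x00ffu  /* low byte: ONE enumerated value      */

#define DIRECTIVE_GIVEN 0x0100u  /* a flag bit, safe to test with '&'   */

enum {                           /* consecutive values, not bit flags   */
    POLICY_BYNODE = 1,
    POLICY_BYSLOT = 2,
    POLICY_BYCORE = 3            /* 3 == 1 | 2, overlaps both above     */
};

static unsigned get_directive(uint16_t m) { return m & DIRECTIVE_MASK; }
static unsigned get_policy(uint16_t m)    { return m & POLICY_MASK; }

/* Flags are independent bits, so a bitwise test is correct here. */
static int directive_given(uint16_t m)
{
    return (get_directive(m) & DIRECTIVE_GIVEN) != 0;
}

/* Buggy: treats the enumerated policy as if it were a bitmask.
 * With policy 3 this also "matches" policies 1 and 2. */
static int policy_is_bitwise(uint16_t m, unsigned wanted)
{
    return (get_policy(m) & wanted) != 0;
}

/* Correct: enumerated values are compared for equality. */
static int policy_is(uint16_t m, unsigned wanted)
{
    return get_policy(m) == wanted;
}
```

With a value built as DIRECTIVE_GIVEN | POLICY_BYCORE, policy_is() matches only POLICY_BYCORE, while policy_is_bitwise() also reports a match for POLICY_BYNODE and POLICY_BYSLOT. That kind of false match is what a "&" comparison on non-bitmap values produces.
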