Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi-1.7.4rc2r30425 produces unexpected output
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-01-27 19:14:56


Kewl - thanks!

On Jan 27, 2014, at 4:08 PM, tmishima_at_[hidden] wrote:

>
>
> Thanks, Ralph. I quickly checked the fix. It worked fine for me.
>
> Tetsuya Mishima
>
>> I fixed that in today's final cleanup
>>
>> On Jan 27, 2014, at 3:17 PM, tmishima_at_[hidden] wrote:
>>
>>
>>
>> As for the NEWS - it is actually already correct. We default to map-by
>> core, not slot, as of 1.7.4.
>>
>> Is it correct? As far as I can tell from browsing the source code,
>> map-by slot is used if np <= 2.
>>
>> [mishima_at_manage openmpi-1.7.4rc2r30425]$ cat -n orte/mca/rmaps/base/rmaps_base_map_job.c
>> ...
>>    107      /* default based on number of procs */
>>    108      if (nprocs <= 2) {
>>    109          opal_output_verbose(5, orte_rmaps_base_framework.framework_output,
>>    110                              "mca:rmaps mapping not given - using byslot");
>>    111          ORTE_SET_MAPPING_POLICY(map->mapping, ORTE_MAPPING_BYSLOT);
>>    112      } else {
>>    113          opal_output_verbose(5, orte_rmaps_base_framework.framework_output,
>>    114                              "mca:rmaps mapping not given - using bysocket");
>>    115          ORTE_SET_MAPPING_POLICY(map->mapping, ORTE_MAPPING_BYSOCKET);
>>    116      }
>>
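The default-selection branch in the listing above can be reduced to a small standalone C sketch. The enum and function names here are hypothetical stand-ins, not ORTE identifiers; only the `nprocs <= 2` threshold and the byslot/bysocket outcomes are taken from the quoted source.

```c
/* Hypothetical stand-ins for ORTE_MAPPING_BYSLOT / ORTE_MAPPING_BYSOCKET. */
typedef enum { MAP_BYSLOT, MAP_BYSOCKET } map_policy_t;

/* Mirrors the quoted branch: when no mapping policy is given,
 * jobs with one or two procs map by slot, larger jobs map by socket. */
map_policy_t default_map_policy(int nprocs)
{
    return (nprocs <= 2) ? MAP_BYSLOT : MAP_BYSOCKET;
}
```

Under these assumptions, `default_map_policy(2)` yields `MAP_BYSLOT` and `default_map_policy(3)` yields `MAP_BYSOCKET`, which is the behavior the proposed NEWS wording describes.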
>> Regards,
>> Tetsuya Mishima
>>
>> On Jan 26, 2014, at 3:02 PM, tmishima_at_[hidden] wrote:
>>
>>
>> Hi Ralph,
>>
>> I tried the latest nightly snapshot, openmpi-1.7.4rc2r30425.tar.gz.
>> Almost everything works fine, except that unexpected output appears,
>> as shown below:
>>
>> [mishima_at_node04 ~]$ mpirun -cpus-per-proc 4 ~/mis/openmpi/demos/myprog
>> App launch reported: 3 (out of 3) daemons - 8 (out of 12) procs
>> ...
>>
>> You dropped the if-statement checking "orte_report_launch_progress" in
>> plm_base_receive.c @ r30423, which causes the problem.
>>
>> --- orte/mca/plm/base/plm_base_receive.c.org 2014-01-25 11:51:59.000000000 +0900
>> +++ orte/mca/plm/base/plm_base_receive.c 2014-01-26 12:20:10.000000000 +0900
>> @@ -315,9 +315,11 @@
>>      /* record that we heard back from a daemon during app launch */
>>      if (running && NULL != jdata) {
>>          jdata->num_daemons_reported++;
>> -        if (0 == jdata->num_daemons_reported % 100 ||
>> -            jdata->num_daemons_reported == orte_process_info.num_procs) {
>> -            ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_REPORT_PROGRESS);
>> +        if (orte_report_launch_progress) {
>> +            if (0 == jdata->num_daemons_reported % 100 ||
>> +                jdata->num_daemons_reported == orte_process_info.num_procs) {
>> +                ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_REPORT_PROGRESS);
>> +            }
>>          }
>>      }
>>      /* prepare for next job */
>>
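The effect of the patch above can be illustrated with a minimal, self-contained C sketch. The helper name and its parameters are hypothetical and do not exist in ORTE; `report_enabled` stands in for `orte_report_launch_progress`, and `num_total` stands in for the value the patch compares against (`orte_process_info.num_procs`).

```c
#include <stdbool.h>

/* Hypothetical helper: decides whether a launch-progress report
 * should be triggered after a daemon has reported in.
 * The caller is assumed to have already incremented num_reported,
 * as the patched code does, so it is at least 1 here. */
bool should_report_progress(bool report_enabled, int num_reported, int num_total)
{
    if (!report_enabled) {
        /* This is the check restored by the patch: without it,
         * progress is reported even when the user did not ask for it. */
        return false;
    }
    /* Report every 100 daemons, and once more when the counts match. */
    return (num_reported % 100 == 0) || (num_reported == num_total);
}
```

With the flag off, the helper never requests a report, which matches the expected behavior of a plain `mpirun` invocation; with the flag on, the original every-100-or-final condition applies unchanged.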
>> Regards,
>> Tetsuya Mishima
>>
>> P.S. It would also be better to change line 65 in NEWS.
>>
>> ...
>> 64 * Mapping:
>> 65 * if #procs <= 2, default to map-by core -> map-by slot
>>                                 ^^^^^^^^^^^
>> 66 * if #procs > 2, default to map-by socket
>> ...
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users