Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi-1.7.4rc2r30425 produces unexpected output
From: tmishima_at_[hidden]
Date: 2014-01-27 19:08:11


Thanks, Ralph. I quickly checked the fix. It worked fine for me.

Tetsuya Mishima

> I fixed that in today's final cleanup
>
> On Jan 27, 2014, at 3:17 PM, tmishima_at_[hidden] wrote:
>
>
>
> As for the NEWS - it is actually already correct. We default to map-by
> core, not slot, as of 1.7.4.
>
> Is it correct? As far as I can tell from browsing the source code,
> map-by slot is used if np <= 2.
>
> [mishima_at_manage openmpi-1.7.4rc2r30425]$ cat -n orte/mca/rmaps/base/rmaps_base_map_job.c
> ...
>   107              /* default based on number of procs */
>   108              if (nprocs <= 2) {
>   109                  opal_output_verbose(5, orte_rmaps_base_framework.framework_output,
>   110                                      "mca:rmaps mapping not given - using byslot");
>   111                  ORTE_SET_MAPPING_POLICY(map->mapping, ORTE_MAPPING_BYSLOT);
>   112              } else {
>   113                  opal_output_verbose(5, orte_rmaps_base_framework.framework_output,
>   114                                      "mca:rmaps mapping not given - using bysocket");
>   115                  ORTE_SET_MAPPING_POLICY(map->mapping, ORTE_MAPPING_BYSOCKET);
>   116              }
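
For readers following the thread, the branch quoted above can be reduced to a small standalone sketch. It only mirrors the decision as it appears in the r30425 snapshot listing; the names `sketch_mapping_t`, `sketch_default_mapping`, and `sketch_mapping_name` are hypothetical stand-ins and are not part of ORTE, and the released 1.7.4 code may differ.

```c
/* Hypothetical stand-ins for the two policies named in the listing;
 * these are NOT the real ORTE types or macros. */
typedef enum {
    SKETCH_MAP_BYSLOT,
    SKETCH_MAP_BYSOCKET
} sketch_mapping_t;

/* Mirrors the quoted branch: when no mapping policy is given,
 * jobs with at most two procs map by slot, larger jobs map by socket. */
sketch_mapping_t sketch_default_mapping(int nprocs)
{
    if (nprocs <= 2) {
        return SKETCH_MAP_BYSLOT;
    }
    return SKETCH_MAP_BYSOCKET;
}

/* Returns the policy name as it appears in the verbose output strings. */
const char *sketch_mapping_name(sketch_mapping_t policy)
{
    return (SKETCH_MAP_BYSLOT == policy) ? "byslot" : "bysocket";
}
```

Calling `sketch_default_mapping(2)` yields the by-slot policy and `sketch_default_mapping(3)` the by-socket policy, which is the behavior the question about the NEWS wording is based on.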
>
> Regards,
> Tetsuya Mishima
>
> On Jan 26, 2014, at 3:02 PM, tmishima_at_[hidden] wrote:
>
>
> Hi Ralph,
>
> I tried the latest nightly snapshot, openmpi-1.7.4rc2r30425.tar.gz.
> Almost everything works fine, except that unexpected output appears,
> as shown below:
>
> [mishima_at_node04 ~]$ mpirun -cpus-per-proc 4 ~/mis/openmpi/demos/myprog
> App launch reported: 3 (out of 3) daemons - 8 (out of 12) procs
> ...
>
> You dropped the if-statement checking "orte_report_launch_progress" in
> plm_base_receive.c @ r30423, which causes the problem.
>
> --- orte/mca/plm/base/plm_base_receive.c.org 2014-01-25 11:51:59.000000000 +0900
> +++ orte/mca/plm/base/plm_base_receive.c 2014-01-26 12:20:10.000000000 +0900
> @@ -315,9 +315,11 @@
>            /* record that we heard back from a daemon during app launch */
>            if (running && NULL != jdata) {
>                jdata->num_daemons_reported++;
> -                if (0 == jdata->num_daemons_reported % 100 ||
> -                    jdata->num_daemons_reported == orte_process_info.num_procs) {
> -                    ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_REPORT_PROGRESS);
> +                if (orte_report_launch_progress) {
> +                    if (0 == jdata->num_daemons_reported % 100 ||
> +                        jdata->num_daemons_reported == orte_process_info.num_procs) {
> +                        ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_REPORT_PROGRESS);
> +                    }
>                }
>            }
>            /* prepare for next job */
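
The effect of the restored guard can be sketched in isolation. This is a minimal illustration of the condition in the patch above, not ORTE code: `sketch_should_report_progress` and its parameters are hypothetical, with `report_enabled` standing in for `orte_report_launch_progress` and `num_expected` for the count the patch compares against.

```c
#include <stdbool.h>

/* Hypothetical helper, not an ORTE function. Returns true when a
 * launch-progress report would be triggered under the patched logic:
 * reporting must be enabled, and the reported count must either be a
 * multiple of 100 or have reached the expected total. */
bool sketch_should_report_progress(bool report_enabled,
                                   int num_reported,
                                   int num_expected)
{
    if (!report_enabled) {
        /* Without the guard, this case would fall through and report. */
        return false;
    }
    return (0 == num_reported % 100) || (num_reported == num_expected);
}
```

With reporting disabled, the helper returns false even when all daemons have reported, which corresponds to suppressing the "App launch reported" line shown earlier in the message.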
>
> Regards,
> Tetsuya Mishima
>
> P.S. It would also be better to change line 65 in NEWS.
>
> ...
> 64   * Mapping:
> 65   *   if #procs <= 2, default to map-by core  -> map-by slot
>                                   ^^^^^^^^^^^
> 66   *   if #procs > 2, default to map-by socket
> ...
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users