Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] new map-by-obj has a problem
From: tmishima_at_[hidden]
Date: 2014-02-28 21:05:13


Hi Ralph, I understood what you meant.

I often use float in our application.
float c = (float)(unsigned int a - unsigned int b) could
be a very huge number if a < b. So I always carefully cast to
int from unsigned int when I subtract them. I didn't realize that
int d = (unsigned int a - unsigned int b) has no problem.
I noticed it thanks to your suggestion.

Therefore, I think my fix is not necessary.

Tetsuya
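
For readers following the thread, the point about unsigned subtraction can be sketched in C. This is a minimal illustration, not code from Open MPI: the helper names (diff_as_float, diff_as_int, diff_cast_first, extra_objs) are hypothetical, and the parameter types of extra_objs are an assumption based on the casts in the patch quoted further down. It also assumes a platform with 32-bit unsigned int where converting an out-of-range unsigned value to int wraps around (implementation-defined before C23, but the usual behavior on two's-complement compilers).

```c
#include <assert.h>
#include <limits.h>

/* Unsigned subtraction wraps modulo UINT_MAX + 1 when a < b, so the
 * float receives a huge positive value instead of a small negative one. */
float diff_as_float(unsigned int a, unsigned int b)
{
    return (float)(a - b);
}

/* Same wrapped difference, converted to int. On typical two's-complement
 * platforms the conversion yields the expected negative value. */
int diff_as_int(unsigned int a, unsigned int b)
{
    return (int)(a - b);
}

/* Casting each operand to int first keeps the subtraction signed, which
 * is what matters when the result goes straight to float. This is only
 * meaningful while both operands fit in an int. */
float diff_cast_first(unsigned int a, unsigned int b)
{
    assert(a <= (unsigned int)INT_MAX && b <= (unsigned int)INT_MAX);
    return (float)((int)a - (int)b);
}

/* Shape of the expression in the patch, with assumed types: the
 * difference is stored in an int and clamped at zero. */
int extra_objs(unsigned int num_procs, int navg, unsigned int nobjs)
{
    int nxtra = (int)(num_procs - (unsigned int)navg * nobjs);
    return nxtra < 0 ? 0 : nxtra;
}
```

Under those assumptions, with a = 3 and b = 5, diff_as_float returns a value near 4.29e9 while diff_as_int and diff_cast_first both give -2, so the explicit casts only matter in the float case. None of the variants helps once num_procs itself has overrun its type.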

> Yes, indeed. In the future, when we have many, many cores
> in the machine, we will have to take care of overrun of
> num_procs.
>
> Tetsuya
>
> > Cool - easily modified. Thanks!
> >
> > Of course, you understand (I'm sure) that the cast does nothing to
> > protect the code from blowing up if we overrun the var. In other words, if
> > the unsigned var has wrapped, then casting it to int
> > won't help - you'll still get a negative integer, and the code will
> > trash.
> >
> >
> > On Feb 28, 2014, at 3:43 PM, tmishima_at_[hidden] wrote:
> >
> > >
> > >
> > > Hi Ralph, I'm a little bit late to your release.
> > >
> > > I found a minor mistake in byobj_span: an integer casting problem.
> > >
> > > --- rmaps_rr_mappers.30892.c 2014-03-01 08:31:50 +0900
> > > +++ rmaps_rr_mappers.c 2014-03-01 08:33:22 +0900
> > > @@ -689,7 +689,7 @@
> > > }
> > >
> > > /* compute how many objs need an extra proc */
> > > - if (0 > (nxtra_objs = app->num_procs - (navg * nobjs))) {
> > > + if (0 > (nxtra_objs = (int)app->num_procs - (navg * (int)nobjs))) {
> > > nxtra_objs = 0;
> > > }
> > >
> > > Tetsuya
> > >
> > >> Please take a look at https://svn.open-mpi.org/trac/ompi/ticket/4317
> > >>
> > >>
> > >> On Feb 27, 2014, at 8:13 PM, tmishima_at_[hidden] wrote:
> > >>
> > >>>
> > >>>
> > >>> Hi Ralph, I can't operate our cluster for a few days, sorry.
> > >>>
> > >>> But now, I'm narrowing down the cause by browsing the source code.
> > >>>
> > >>> My best guess is the line 529. The opal_hwloc_base_get_obj_by_type will
> > >>> reset the object pointer to the first one when you move on to the next
> > >>> node.
> > >>>
> > >>> 529 if (NULL == (obj = opal_hwloc_base_get_obj_by_type(node->topology, target, cache_level, i, OPAL_HWLOC_AVAILABLE))) {
> > >>> 530 ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
> > >>> 531 return ORTE_ERR_NOT_FOUND;
> > >>> 532 }
> > >>>
> > >>> if node->slots=1, then nprocs is set as nprocs=1 in the second pass:
> > >>>
> > >>> 495 nprocs = (node->slots - node->slots_inuse) / orte_rmaps_base.cpus_per_rank;
> > >>> 496 if (nprocs < 1) {
> > >>> 497 if (second_pass) {
> > >>> 498 /* already checked for oversubscription permission, so at least put
> > >>> 499 * one proc on it
> > >>> 500 */
> > >>> 501 nprocs = 1;
> > >>>
> > >>> Therefore, opal_hwloc_base_get_obj_by_type is called one by one at each
> > >>> node, which means the object we get is always the first one.
> > >>>
> > >>> It's not elegant but I guess you need dummy calls of
> > >>> opal_hwloc_base_get_obj_by_type to move the object pointer to the right
> > >>> place or modify opal_hwloc_base_get_obj_by_type itself.
> > >>>
> > >>> Tetsuya
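
Tetsuya's guess above can be illustrated with a toy model. The sketch below is not Open MPI code: map_one_per_visit, NUM_NODES and OBJS_PER_NODE are hypothetical names, and it assumes (as the guess does) that each visit to a node places exactly one process and that the object index used for the lookup starts again from 0 on every visit. With keep_cursor set to 0 every placement lands on object 0. With keep_cursor set to 1 each node remembers where it left off, which is one possible way of moving the pointer to the right place.

```c
#include <assert.h>

#define NUM_NODES 2
#define OBJS_PER_NODE 2 /* two sockets per node, as in the lstopo output */

/* Toy model: visit the nodes round-robin and place one process per visit.
 * obj_of_proc[p] receives the object (socket) index chosen for the p-th
 * placement. keep_cursor == 0 restarts the object index at 0 on every
 * visit; keep_cursor != 0 lets each node remember its next object. */
void map_one_per_visit(int nprocs, int keep_cursor, int obj_of_proc[])
{
    int cursor[NUM_NODES] = {0};
    int placed = 0;

    assert(nprocs >= 0);
    while (placed < nprocs) {
        for (int node = 0; node < NUM_NODES && placed < nprocs; node++) {
            int obj = keep_cursor ? cursor[node] : 0;
            obj_of_proc[placed++] = obj;
            cursor[node] = (obj + 1) % OBJS_PER_NODE;
        }
    }
}
```

Calling map_one_per_visit(8, 0, objs) fills objs with zeros only, which resembles the bindings in the failing run where every rank ends up on socket 0. The real mapper assigns ranks to nodes differently, so this only models the object selection.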
> > >>>
> > >>>> I'm having trouble seeing why it is failing, so I added some more debug
> > >>>> output. Could you run the failure case again with -mca rmaps_base_verbose 10?
> > >>>>
> > >>>> Thanks
> > >>>> Ralph
> > >>>>
> > >>>> On Feb 27, 2014, at 6:11 PM, tmishima_at_[hidden] wrote:
> > >>>>
> > >>>>>
> > >>>>>
> > >>>>> Just checking the difference, nothing more significant...
> > >>>>>
> > >>>>> Anyway, I guess it's due to the behavior when the slot count is missing
> > >>>>> (regarded as slots=1) and the node is oversubscribed unintentionally.
> > >>>>>
> > >>>>> I'm going out now, so I can't verify it quickly. If I provide the
> > >>>>> correct slot counts, it will work, I guess. What do you think?
> > >>>>>
> > >>>>> Tetsuya
> > >>>>>
> > >>>>>> "restore" in what sense?
> > >>>>>>
> > >>>>>> On Feb 27, 2014, at 4:10 PM, tmishima_at_[hidden] wrote:
> > >>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Hi Ralph, this is just for your information.
> > >>>>>>>
> > >>>>>>> I tried to restore the previous orte_rmaps_rr_byobj. Then I get the
> > >>>>>>> result below with this command line:
> > >>>>>>>
> > >>>>>>> mpirun -np 8 -host node05,node06 -report-bindings -map-by socket:pe=2 -display-map -bind-to core:overload-allowed ~/mis/openmpi/demos/myprog
> > >>>>>>> Data for JOB [31184,1] offset 0
> > >>>>>>>
> > >>>>>>> ======================== JOB MAP ========================
> > >>>>>>>
> > >>>>>>> Data for node: node05 Num slots: 1 Max slots: 0 Num procs: 7
> > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 0
> > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 2
> > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 4
> > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 6
> > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 1
> > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 3
> > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 5
> > >>>>>>>
> > >>>>>>> Data for node: node06 Num slots: 1 Max slots: 0 Num procs: 1
> > >>>>>>> Process OMPI jobid: [31184,1] App: 0 Process rank: 7
> > >>>>>>>
> > >>>>>>> =============================================================
> > >>>>>>> [node06.cluster:18857] MCW rank 7 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./.][./././.]
> > >>>>>>> [node05.cluster:21399] MCW rank 3 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././.][././B/B]
> > >>>>>>> [node05.cluster:21399] MCW rank 4 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./.][./././.]
> > >>>>>>> [node05.cluster:21399] MCW rank 5 bound to socket 1[core 4[hwt 0]], socket 1[core 5[hwt 0]]: [./././.][B/B/./.]
> > >>>>>>> [node05.cluster:21399] MCW rank 6 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B][./././.]
> > >>>>>>> [node05.cluster:21399] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./.][./././.]
> > >>>>>>> [node05.cluster:21399] MCW rank 1 bound to socket 1[core 4[hwt 0]], socket 1[core 5[hwt 0]]: [./././.][B/B/./.]
> > >>>>>>> [node05.cluster:21399] MCW rank 2 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B][./././.]
> > >>>>>>> ....
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Then I add "-hostfile pbs_hosts" and the result is:
> > >>>>>>>
> > >>>>>>> [mishima_at_manage work]$cat pbs_hosts
> > >>>>>>> node05 slots=8
> > >>>>>>> node06 slots=8
> > >>>>>>> [mishima_at_manage work]$ mpirun -np 8 -hostfile ~/work/pbs_hosts -report-bindings -map-by socket:pe=2 -display-map ~/mis/openmpi/demos/myprog
> > >>>>>>> Data for JOB [30254,1] offset 0
> > >>>>>>>
> > >>>>>>> ======================== JOB MAP ========================
> > >>>>>>>
> > >>>>>>> Data for node: node05 Num slots: 8 Max slots: 0 Num procs: 4
> > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 0
> > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 2
> > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 1
> > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 3
> > >>>>>>>
> > >>>>>>> Data for node: node06 Num slots: 8 Max slots: 0 Num procs: 4
> > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 4
> > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 6
> > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 5
> > >>>>>>> Process OMPI jobid: [30254,1] App: 0 Process rank: 7
> > >>>>>>>
> > >>>>>>> =============================================================
> > >>>>>>> [node05.cluster:21501] MCW rank 2 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B][./././.]
> > >>>>>>> [node05.cluster:21501] MCW rank 3 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././.][././B/B]
> > >>>>>>> [node05.cluster:21501] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./.][./././.]
> > >>>>>>> [node05.cluster:21501] MCW rank 1 bound to socket 1[core 4[hwt 0]], socket 1[core 5[hwt 0]]: [./././.][B/B/./.]
> > >>>>>>> [node06.cluster:18935] MCW rank 6 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B][./././.]
> > >>>>>>> [node06.cluster:18935] MCW rank 7 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././.][././B/B]
> > >>>>>>> [node06.cluster:18935] MCW rank 4 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./.][./././.]
> > >>>>>>> [node06.cluster:18935] MCW rank 5 bound to socket 1[core 4[hwt 0]], socket 1[core 5[hwt 0]]: [./././.][B/B/./.]
> > >>>>>>> ....
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> I think the previous version's behavior would be close to what I expect.
> > >>>>>>>
> > >>>>>>> Tetsuya
> > >>>>>>>
> > >>>>>>>> They have 4 cores/socket and 2 sockets, 4 x 2 = 8 cores in total, each.
> > >>>>>>>>
> > >>>>>>>> Here is the output of lstopo.
> > >>>>>>>>
> > >>>>>>>> [mishima_at_manage round_robin]$ rsh node05
> > >>>>>>>> Last login: Tue Feb 18 15:10:15 from manage
> > >>>>>>>> [mishima_at_node05 ~]$ lstopo
> > >>>>>>>> Machine (32GB)
> > >>>>>>>> NUMANode L#0 (P#0 16GB) + Socket L#0 + L3 L#0 (6144KB)
> > >>>>>>>> L2 L#0 (512KB) + L1d L#0 (64KB) + L1i L#0 (64KB) + Core L#0 + PU L#0 (P#0)
> > >>>>>>>> L2 L#1 (512KB) + L1d L#1 (64KB) + L1i L#1 (64KB) + Core L#1 + PU L#1 (P#1)
> > >>>>>>>> L2 L#2 (512KB) + L1d L#2 (64KB) + L1i L#2 (64KB) + Core L#2 + PU L#2 (P#2)
> > >>>>>>>> L2 L#3 (512KB) + L1d L#3 (64KB) + L1i L#3 (64KB) + Core L#3 + PU L#3 (P#3)
> > >>>>>>>> NUMANode L#1 (P#1 16GB) + Socket L#1 + L3 L#1 (6144KB)
> > >>>>>>>> L2 L#4 (512KB) + L1d L#4 (64KB) + L1i L#4 (64KB) + Core L#4 + PU L#4 (P#4)
> > >>>>>>>> L2 L#5 (512KB) + L1d L#5 (64KB) + L1i L#5 (64KB) + Core L#5 + PU L#5 (P#5)
> > >>>>>>>> L2 L#6 (512KB) + L1d L#6 (64KB) + L1i L#6 (64KB) + Core L#6 + PU L#6 (P#6)
> > >>>>>>>> L2 L#7 (512KB) + L1d L#7 (64KB) + L1i L#7 (64KB) + Core L#7 + PU L#7 (P#7)
> > >>>>>>>> ....
> > >>>>>>>>
> > >>>>>>>> I focused on byobj_span and bynode. I didn't notice byobj was
> > >>>>>>>> modified, sorry.
> > >>>>>>>>
> > >>>>>>>> Tetsuya
> > >>>>>>>>
> > >>>>>>>>> Hmmm..what does your node look like again (sockets and cores)?
> > >>>>>>>>>
> > >>>>>>>>> On Feb 27, 2014, at 3:19 PM, tmishima_at_[hidden] wrote:
> > >>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Hi Ralph, I'm afraid to say your new "map-by obj" causes another
> > >>>>>>>>>> problem.
> > >>>>>>>>>>
> > >>>>>>>>>> I get an overload message with this command line, as shown below:
> > >>>>>>>>>>
> > >>>>>>>>>> mpirun -np 8 -host node05,node06 -report-bindings -map-by socket:pe=2 -display-map ~/mis/openmpi/demos/myprog
> > >>>>>>>>>>
> > >>>>>>>>>> --------------------------------------------------------------------------
> > >>>>>>>>>> A request was made to bind to that would result in binding more
> > >>>>>>>>>> processes than cpus on a resource:
> > >>>>>>>>>>
> > >>>>>>>>>> Bind to: CORE
> > >>>>>>>>>> Node: node05
> > >>>>>>>>>> #processes: 2
> > >>>>>>>>>> #cpus: 1
> > >>>>>>>>>>
> > >>>>>>>>>> You can override this protection by adding the "overload-allowed"
> > >>>>>>>>>> option to your binding directive.
> > >>>>>>>>>>
> > >>>>>>>>>> --------------------------------------------------------------------------
> > >>>>>>>>>>
> > >>>>>>>>>> Then, I add "-bind-to core:overload-allowed" to see what happens.
> > >>>>>>>>>>
> > >>>>>>>>>> mpirun -np 8 -host node05,node06 -report-bindings -map-by socket:pe=2 -display-map -bind-to core:overload-allowed ~/mis/openmpi/demos/myprog
> > >>>>>>>>>> Data for JOB [14398,1] offset 0
> > >>>>>>>>>>
> > >>>>>>>>>> ======================== JOB MAP ========================
> > >>>>>>>>>>
> > >>>>>>>>>> Data for node: node05 Num slots: 1 Max slots: 0 Num procs: 4
> > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 0
> > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 1
> > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 2
> > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 3
> > >>>>>>>>>>
> > >>>>>>>>>> Data for node: node06 Num slots: 1 Max slots: 0 Num procs: 4
> > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 4
> > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 5
> > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 6
> > >>>>>>>>>> Process OMPI jobid: [14398,1] App: 0 Process rank: 7
> > >>>>>>>>>>
> > >>>>>>>>>> =============================================================
> > >>>>>>>>>> [node06.cluster:18443] MCW rank 6 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./.][./././.]
> > >>>>>>>>>> [node05.cluster:20901] MCW rank 2 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./.][./././.]
> > >>>>>>>>>> [node06.cluster:18443] MCW rank 7 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B][./././.]
> > >>>>>>>>>> [node05.cluster:20901] MCW rank 3 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B][./././.]
> > >>>>>>>>>> [node06.cluster:18443] MCW rank 4 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./.][./././.]
> > >>>>>>>>>> [node05.cluster:20901] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./.][./././.]
> > >>>>>>>>>> [node06.cluster:18443] MCW rank 5 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B][./././.]
> > >>>>>>>>>> [node05.cluster:20901] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B][./././.]
> > >>>>>>>>>> Hello world from process 4 of 8
> > >>>>>>>>>> Hello world from process 2 of 8
> > >>>>>>>>>> Hello world from process 6 of 8
> > >>>>>>>>>> Hello world from process 0 of 8
> > >>>>>>>>>> Hello world from process 5 of 8
> > >>>>>>>>>> Hello world from process 1 of 8
> > >>>>>>>>>> Hello world from process 7 of 8
> > >>>>>>>>>> Hello world from process 3 of 8
> > >>>>>>>>>>
> > >>>>>>>>>> When I add "map-by obj:span", it works fine:
> > >>>>>>>>>>
> > >>>>>>>>>> mpirun -np 8 -host node05,node06 -report-bindings -map-by socket:pe=2,span -display-map ~/mis/openmpi/demos/myprog
> > >>>>>>>>>> Data for JOB [14703,1] offset 0
> > >>>>>>>>>>
> > >>>>>>>>>> ======================== JOB MAP ========================
> > >>>>>>>>>>
> > >>>>>>>>>> Data for node: node05 Num slots: 1 Max slots: 0 Num procs: 4
> > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 0
> > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 2
> > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 1
> > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 3
> > >>>>>>>>>>
> > >>>>>>>>>> Data for node: node06 Num slots: 1 Max slots: 0 Num procs: 4
> > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 4
> > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 6
> > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 5
> > >>>>>>>>>> Process OMPI jobid: [14703,1] App: 0 Process rank: 7
> > >>>>>>>>>>
> > >>>>>>>>>> =============================================================
> > >>>>>>>>>> [node06.cluster:18491] MCW rank 6 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B][./././.]
> > >>>>>>>>>> [node05.cluster:20949] MCW rank 2 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B][./././.]
> > >>>>>>>>>> [node06.cluster:18491] MCW rank 7 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././.][././B/B]
> > >>>>>>>>>> [node05.cluster:20949] MCW rank 3 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././.][././B/B]
> > >>>>>>>>>> [node06.cluster:18491] MCW rank 4 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./.][./././.]
> > >>>>>>>>>> [node05.cluster:20949] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./.][./././.]
> > >>>>>>>>>> [node06.cluster:18491] MCW rank 5 bound to socket 1[core 4[hwt 0]], socket 1[core 5[hwt 0]]: [./././.][B/B/./.]
> > >>>>>>>>>> [node05.cluster:20949] MCW rank 1 bound to socket 1[core 4[hwt 0]], socket 1[core 5[hwt 0]]: [./././.][B/B/./.]
> > >>>>>>>>>> ....
> > >>>>>>>>>>
> > >>>>>>>>>> So, byobj_span would be okay. Of course, bynode and byslot should be
> > >>>>>>>>>> okay.
> > >>>>>>>>>> Could you take a look at orte_rmaps_rr_byobj again?
> > >>>>>>>>>>
> > >>>>>>>>>> Regards,
> > >>>>>>>>>> Tetsuya Mishima
> > >>>>>>>>>>
> > >>>>>>>>>> _______________________________________________
> > >>>>>>>>>> users mailing list
> > >>>>>>>>>> users_at_[hidden]
> > >>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users