Open MPI logo

Hardware Locality Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Hardware Locality Development mailing list

Subject: [hwloc-devel] Patch file to let hwloc-distrib output the PUs starting from the last object
From: Jiri Hladky (hladky.jiri_at_[hidden])
Date: 2013-09-18 08:42:54


Hi Brice,
hi Samuel,

hopefully you still remember discussion we had regarding a proposed new
option
--reverse_direction
or perhaps better
--start_from_last_object

which would force hwloc-distrib to start assigning PUs from the last object
(rather than from PU#0 as the default)

I have now implemented the change on the latest trunk.

svn co http://svn.open-mpi.org/svn/hwloc/trunk hwloc-trunk

I had to modify only three functions

   - hwloc_distributev in helper.h
   - hwloc_distribute in helper.h
   - hwloc_bitmap_singlify in bitmap.c

The change is trivial. In helper.h I had to only change the direction of
loops, ie from

for(i=0;i<N;++i) {
  anything[i] =
to
for(i=N;i>0;--i)
  anything[i-1] =

(Please be careful here as "i" is unsigned int and it will wrap around when
doing i-- for i == 0 . Alternative code would be

for (size_t i = n-1; i < n ; --i) { ... }

which has the benefit that you don't need to change i to i-1 in the
loop body but it's for statement is less readable)

In hwloc_bitmap_singlify I had to

   - again change to loop direction
   - replace

int _ffs = hwloc_ffsl(w);
set->ulongs[i] = HWLOC_SUBBITMAP_CPU(_ffs-1);

with

int _fls = hwloc_flsl(w);
set->ulongs[i-1] = HWLOC_SUBBITMAP_CPU(_fls-1);

The patch file is attached.

Could you please review it if you agree to add it to the 1.8?

Other changes would require:

   - add the new command line option to hwloc-distrib
   - clone functions
   hwloc_distributev, hwloc_distribute, hwloc_bitmap_singlify to
   hwloc_distributev_reverse, hwloc_distribute_reverse,
hwloc_bitmap_singlify_reverseand when a command line option is
specified, call these new functions

I have tested the new code on 4 Socket system with 15 cores per socket and
30 PUs per socket and it works as expected:

utils/hwloc-distrib --single --taskset 64
0x800000000000000000000000000000
0x400000000000000000000000000000
0x100000000000000000000000000
0x20000000000000000000000
0x4000000000000000000

It will assign the jobs in this order

   1. last PU on the last core on Socket #4
   2. last PU on the last core Socket #3
   3. last PU on the last core Socket #2
   4. last PU on the last core Socket #1
   5. last PU on the second last core on Socket #4

Please let me know what you think about it. I would really appreciate if
you can include it into the official code.

Thanks a lot!
Jirka

On Thu, Aug 29, 2013 at 9:58 AM, Brice Goglin <Brice.Goglin_at_[hidden]> wrote:

> Le 28/08/2013 16:20, Jiri Hladky a écrit :
>
> I got your point. On the other hand I think that hwloc-distrib is at the
> moment not flexible enough to handle such case. I believe that the current
> strategy - start from the first object - is not the best one. From my
> experience, core 0 is always most used by the system so it seems that
> better strategy would to allocate the cores from the last one.
>
>
> Most people expect their jobs to be launched in order: process0 on first
> cores, process1 on next cores, etc.
>
> Again, you don't want to reverse the output, you want to ignore first
> core/socket if possible.
>
>
> I was looking at the source code of the hwloc-distrib and I believe
> that only this part of the code would be affected:
>
> for (i = 0; i < chunks; i++)
> roots[i] = hwloc_get_obj_by_depth(topology, from_depth, i); =>
> change this to roots[i] = hwloc_get_obj_by_depth(topology, from_depth,
> MAX_COUNT - i);
>
> hwloc_distributev(topology, roots, chunks, cpuset, n, to_depth); =>
> rewrite this to iterate in the reverse direction
>
>
> You can do the exact same thing by reversing your loop in the caller.
>
> Anyway, reversing the loop just move the core you don't want to the end of
> the list. But if you use the entire list, you end up using the exact same
> cores. You just reordered the processes among these cores.
>
>
> Am I missing something? In case of infinite bitmap hwloc-distrib will
> error out. This should solve the problems with hwloc_bitmap_singlify.
>
>
> We need a new singlify() above, one that doesn't use the first bit. That's
> what will make you use a core that is not the first one in each socket.
>
> Problems with infinite bitmaps are unrelated with hwloc-distrib, but they
> need to be handled in that new bitmap API anyway.
>
> Brice
>
>
> _______________________________________________
> hwloc-devel mailing list
> hwloc-devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-devel
>