
Hardware Locality Development Mailing List Archives


Subject: Re: [hwloc-devel] hwloc-distrib - please add the option to distribute the jobs in the reverse direction
From: Jiri Hladky (hladky.jiri_at_[hidden])
Date: 2013-08-28 10:39:36


Correction: MAX_COUNT is topology->level_nbobjects[from_depth] - 1 (not topology->nb_levels).
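
To make the index arithmetic concrete, here is a minimal sketch of the reverse root selection. It deliberately does not call hwloc: reverse_index() and fill_reverse_roots() are hypothetical helpers, and nbobjs stands for the number of objects at from_depth (which I assume could be obtained through the public hwloc_get_nbobjs_by_depth() rather than by reading the internal field). It only shows which object index each chunk would map to.

```c
#include <assert.h>

/* Hypothetical helper, not part of hwloc: map chunk i to an object index
 * counted from the end of a level that holds nbobjs objects. */
static unsigned reverse_index(unsigned nbobjs, unsigned i)
{
    assert(i < nbobjs);      /* valid object indices are 0 .. nbobjs-1 */
    return nbobjs - 1 - i;   /* MAX_COUNT - i, with MAX_COUNT = nbobjs - 1 */
}

/* Hypothetical helper: fill out[0..chunks-1] with the object indices to use
 * as roots, starting from the last object of the level. */
static void fill_reverse_roots(unsigned nbobjs, unsigned chunks, unsigned *out)
{
    unsigned i;

    assert(chunks <= nbobjs); /* cannot pick more roots than objects */
    for (i = 0; i < chunks; i++)
        out[i] = reverse_index(nbobjs, i);
}
```

In hwloc-distrib, each out[i] would then be passed as the index argument of hwloc_get_obj_by_depth(topology, from_depth, ...) in place of i. This covers only the choice of roots; the distribution inside each root would still need to run in the reverse direction.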

On Wed, Aug 28, 2013 at 4:20 PM, Jiri Hladky <hladky.jiri_at_[hidden]> wrote:

>
> On Tue, Aug 27, 2013 at 7:57 PM, Brice Goglin <Brice.Goglin_at_[hidden]> wrote:
>>
>> You just explained why I don't like weights. Some people will want to
>> ignore L2, some won't. Specifying all this on the command-line would be
>> horrible, and implementing it will be horrible too.
>>
>
> :-) Agreed.
>
>
>>
>> > I think that a --reverse option is much easier to implement, and it
>> > gives a clear requirement and a clear understanding of what the
>> > output should look like.
>>
>> Implementing reverse bitmap_singlify() isn't so easy.
>>
>> Also "--reverse" would have semantics that no users ever requested,
>> it's only a workaround for your actual need ("ignore core0 if
>> possible"). What if somebody later comes with a machine where he wants to
>> preferably ignore core 7 and maybe ignore core 11 too, because some
>> special daemons are running there? We'd need to add
>> --dont-reverse-but-ignore-some-cores-if-possible. Or what if somebody
>> wants to ignore the first core but still get other cores in the normal
>> order?
>>
>
> I got your point. On the other hand, I think that hwloc-distrib is at the
> moment not flexible enough to handle such cases. I believe that the current
> strategy - start from the first object - is not the best one. In my
> experience, core 0 is always the one most used by the system, so it seems
> that a better strategy would be to allocate the cores starting from the
> last one. So, for example, when I say that I would like to avoid PU#0, it
> means that I would in fact like to avoid Socket#0 as well for as long as
> possible. The same applies to NUMANode#0.
>
> I was looking at the source code of the hwloc-distrib and I believe that
> only this part of the code would be affected:
>
> for (i = 0; i < chunks; i++)
>   roots[i] = hwloc_get_obj_by_depth(topology, from_depth, i);
>   => change this to:
>   roots[i] = hwloc_get_obj_by_depth(topology, from_depth, MAX_COUNT - i);
>
> hwloc_distributev(topology, roots, chunks, cpuset, n, to_depth);
>   => rewrite this to iterate in the reverse direction
>
> MAX_COUNT seems to be known and accessible as topology->nb_levels.
>
> Am I missing something? In the case of an infinite bitmap, hwloc-distrib
> will error out. This should solve the problems with hwloc_bitmap_singlify.
>
>
>> I tend to think we should let the application handle these specific
>> cases (finding what can be ignored while still having enough objects,
>> and then calling distribute accordingly).
>>
>
> Actually, I believe that this change is more easily implemented directly in
> the C code rather than using some workaround in Bash. And I believe that
> the use case is not so exotic. As outlined above, starting from core#0 is
> not always the best strategy.
>
> Please let me know what you think.
>
> Jirka
>