Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] hwloc-distrib - please add the option to distribute the jobs in the reverse direction
From: Jiri Hladky (hladky.jiri_at_[hidden])
Date: 2013-08-29 09:36:56


Hi Brice,
Hi Samuel,

It seems I have done a poor job of explaining how I'm using hwloc-distrib. Let
me try again.

On a 128-core system, for example, we run a series of parallel jobs:

1 job
2 jobs
4 jobs
8 jobs
12 jobs
and so on up to 128 jobs.

Parallel jobs are synchronized via semaphores. We measure the total
runtime for each series and watch how the Linux job scheduler performs. We run
the jobs in three configurations:
* with no restrictions at all
* bound to a CPU via taskset
* bound to a NUMA node via numactl

We compare the results against each other and also across different versions
of the Linux kernel. We use hwloc-distrib to distribute the jobs in the best
possible way for the taskset command. The idea is that the Linux scheduler
should distribute the jobs well enough to match the performance achieved
with hwloc-distrib & taskset.
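
As a rough illustration of the hwloc-distrib & taskset combination, here is a
minimal sketch. It assumes hwloc-distrib has been asked for taskset-format
output (one mask per line, one line per job) and that each job is then started
under its own mask. The helper name taskset_commands and the job name ./job
are hypothetical, not part of hwloc or of our test harness.

```python
def taskset_commands(masks, job_argv):
    """Build one `taskset MASK command...` argv per mask line.

    masks    -- lines of taskset-format output, e.g. "0x00000001"
    job_argv -- the command to run for every job (hypothetical, e.g. ["./job"])
    """
    commands = []
    for mask in masks:
        mask = mask.strip()
        if mask:  # skip empty lines
            commands.append(["taskset", mask] + list(job_argv))
    return commands


if __name__ == "__main__":
    # Example masks only; real values come from the hwloc-distrib output.
    for argv in taskset_commands(["0x00000001", "0x00000100"], ["./job"]):
        print(" ".join(argv))
```

Each resulting argv could be handed to subprocess.Popen to start the job; the
sketch stops at building the command lines so that it runs without hwloc.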

So we run a series of hwloc-distrib commands:

hwloc-distrib --single --taskset 1
hwloc-distrib --single --taskset 2
hwloc-distrib --single --taskset 4
hwloc-distrib --single --taskset 8

and so on. We always use the full output of each hwloc-distrib command to
start the jobs via taskset. Right now, on an 8-socket server, we are getting
this output:

hwloc-distrib --single --taskset 1 => Socket0, core 0
hwloc-distrib --single --taskset 2 => Socket0, core 0 & Socket1, core 0
hwloc-distrib --single --taskset 4 => Socket0, core 0 & Socket1, core 0 &
Socket2, core 0 & Socket3, core 0
hwloc-distrib --single --taskset 8 => Socket0, core 0 & Socket1, core 0 &
Socket2, core 0 & Socket3, core 0 & Socket4, core 0 & Socket5, core 0 &
Socket6, core 0 & Socket7, core 0

This is not optimal, since core#0 is always the one most heavily used by the
OS. With the proposed --reverse option I would expect this output:
hwloc-distrib --single --taskset 1 => Socket7, core 7
hwloc-distrib --single --taskset 2 => Socket7, core 7 & Socket6, core 7
hwloc-distrib --single --taskset 4 => Socket7, core 7 & Socket6, core 7 &
Socket5, core 7 & Socket4, core 7

I do not care about the order in which hwloc-distrib sorts the results. For
example, these two possible outputs of hwloc-distrib --single --taskset 2
Socket7, core 7 & Socket6, core 7
and
Socket6, core 7 & Socket7, core 7

are equivalent for me.

What I need is for hwloc-distrib to start from the last socket and the last
core in that socket when distributing the jobs. Right now it starts from
Socket0, core0.
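
To make the request concrete, here is a small sketch of the placement I have in
mind. It is not hwloc's actual implementation; it only reproduces the
socket/core pairs listed in this mail, assuming 8 sockets with 8 cores each and
one job per socket before any socket gets a second job. The function name
place_jobs and its reverse flag are hypothetical.

```python
SOCKETS = 8            # assumption: matches the server described in this mail
CORES_PER_SOCKET = 8   # assumption: physical cores per socket


def place_jobs(n_jobs, reverse=False):
    """Return a list of (socket, core) pairs, one per job.

    Forward: start at Socket0, core 0 and move to the next socket.
    Reverse: start at the last socket and its last core instead.
    """
    placement = []
    for job in range(n_jobs):
        socket = job % SOCKETS
        core = (job // SOCKETS) % CORES_PER_SOCKET
        if reverse:
            socket = SOCKETS - 1 - socket
            core = CORES_PER_SOCKET - 1 - core
        placement.append((socket, core))
    return placement


if __name__ == "__main__":
    print(place_jobs(2))                # forward placement
    print(place_jobs(2, reverse=True))  # placement wanted from --reverse
```

With reverse=True, core 0 of each socket is only used once every other core
in that socket has a job, which is the behaviour I am asking for.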

I have attached /proc/interrupts for that server. It has 8 sockets;
each socket has 8 physical cores (16 PUs with HT). Please see the interrupt
peaks for cores 0, 8, 16, 24, 32, 40, 48 and 56. They correspond to core#0
in each socket. Please be sure to turn off line wrapping when
inspecting that file.
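
If reading the raw file is awkward, the per-CPU totals can be summed with a
short script. This is a minimal sketch that assumes the usual
/proc/interrupts layout: a header line of CPU names, then one row per
interrupt source with one count per CPU. Rows with fewer counts than CPUs
(such as ERR) are skipped. The function name per_cpu_interrupt_totals is
hypothetical.

```python
def per_cpu_interrupt_totals(text):
    """Sum the interrupt counts of every row, per CPU column."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())      # header: CPU0 CPU1 ...
    totals = [0] * ncpu
    for line in lines[1:]:
        _, _, rest = line.partition(":")   # drop the "NN:" / "NMI:" label
        counts = rest.split()[:ncpu]
        # Only accept rows that carry one numeric count per CPU.
        if len(counts) == ncpu and all(c.isdigit() for c in counts):
            for cpu, count in enumerate(counts):
                totals[cpu] += int(count)
    return totals


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        totals = per_cpu_interrupt_totals(f.read())
    # Print the CPUs with the most interrupts first.
    for cpu in sorted(range(len(totals)), key=totals.__getitem__, reverse=True):
        print(f"CPU{cpu}: {totals[cpu]}")
```

The indices here are the CPU numbers used in the /proc/interrupts header.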

Hopefully that makes the point clear. Please let me know if you have questions.

What do you think about this? Does it make sense to you?

Thanks!
Jirka

On Thu, Aug 29, 2013 at 10:20 AM, Samuel Thibault
<samuel.thibault_at_[hidden]> wrote:

> Brice Goglin wrote on Thu 29 Aug 2013 09:58:17 +0200:
> > Anyway, reversing the loop just move the core you don't want to the end
> > of the list. But if you use the entire list, you end up using the exact
> > same cores.
>
> He wants that, yes.
>
> Samuel