
Hardware Locality Development Mailing List Archives


Subject: Re: [hwloc-devel] hwloc-distrib --among
From: Jirka Hladky (jhladky_at_[hidden])
Date: 2010-11-18 09:13:46


Hi Samuel,

thanks for looking into it! I'm using hwloc_distribute to distribute parallel
jobs on multi-socket systems.

Usually, it gives nice results: running
hwloc-distrib --single <N>
on a box with <N> sockets will distribute one job per socket. This is what I
want.

hwloc-distrib --single <2*N>
will distribute 2 jobs per socket, picking PUs wisely.

It breaks, however, on strange systems. Please check with
lstopo --input
or hwloc-distrib --input
on the topology I sent you with my last e-mail (hp-dl980g7-01.tar.bz2, sent on
Tuesday 09:30:37 pm).

This box has a broken NUMA topology - there are 7 sockets in one NUMA node and
1 socket in another NUMA node.
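
To illustrate why such a lopsided tree can defeat a top-down distribution,
here is a toy Python model. It is only a sketch and not hwloc's actual code:
the nested-list topology, the socket names S0..S7, and the functions
count_leaves, split and distribute are all made up for illustration. It
assumes that jobs are split evenly across the children of each node, no matter
how many sockets sit underneath, and compares that with a split weighted by
the number of sockets.

```python
def count_leaves(node):
    """Number of sockets (leaves) beneath a node of the toy tree."""
    if not isinstance(node, list):
        return 1
    return sum(count_leaves(child) for child in node)


def split(n, weights):
    """Split n jobs into integer shares proportional to weights.

    Uses the largest-remainder rule; ties go to the earlier child.
    """
    total = sum(weights)
    shares = [n * w // total for w in weights]
    remainders = [(n * w % total, -i) for i, w in enumerate(weights)]
    leftover = n - sum(shares)
    for _, neg_i in sorted(remainders, reverse=True)[:leftover]:
        shares[-neg_i] += 1
    return shares


def distribute(node, n, weighted):
    """Return the socket chosen for each of the n jobs."""
    if n == 0:
        return []
    if not isinstance(node, list):
        # A leaf: every job that reaches this socket stays on it.
        return [node] * n
    if weighted:
        weights = [count_leaves(child) for child in node]
    else:
        weights = [1] * len(node)  # every child counts the same
    placement = []
    for child, share in zip(node, split(n, weights)):
        placement += distribute(child, share, weighted)
    return placement


# Hypothetical layout: 7 sockets in one NUMA node, 1 socket in the other.
topology = [["S0", "S1", "S2", "S3", "S4", "S5", "S6"], ["S7"]]

even = distribute(topology, 8, weighted=False)
by_sockets = distribute(topology, 8, weighted=True)

print(even)        # ['S0', 'S1', 'S2', 'S3', 'S7', 'S7', 'S7', 'S7']
print(by_sockets)  # ['S0', 'S1', 'S2', 'S3', 'S4', 'S5', 'S6', 'S7']
```

In this model the even split hands 4 of the 8 jobs to the NUMA node that holds
a single socket, so S7 is oversubscribed while S4, S5 and S6 stay idle.
Weighting each child by its socket count gives one job per socket.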

My goal is to distribute one job per socket with the command
hwloc-distrib --single 8

This is not working. I have tried various --among and --ignore options to
achieve this, but without success.

Please try
hwloc-distrib --input hp-dl980g7-01 --single 8
with the data I sent you on Tuesday (tar jxvf hp-dl980g7-01.tar.bz2). The goal
is to distribute one job per socket.

Thanks!
Jirka

On Tuesday, November 16, 2010 10:20:38 pm Samuel Thibault wrote:
> Samuel Thibault, le Tue 16 Nov 2010 22:18:54 +0100, a écrit :
> > Also note that currently the hwloc_distribute() function doesn't take
> > e.g. the number of PUs into account when splitting elements over the
> > hierarchy. It was more a demonstration example than something to be used
> > as is. We can however extend it, we just need to know what's desired.
>
> Reading your mail again, I guess that's where your issue actually lay.
>
> Samuel