Hey Jeff,

It's not in OpenMPI or MPICH :(. It's a custom library which is not MPI-aware, making it difficult to share the topology query. I'll see if we can get a standalone piece of code.

From earlier posts it sounds like OpenMPI queries once per physical node, so it probably won't have this problem. I'm guessing MPICH would do something similar?
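
For anyone else hitting this, a minimal sketch of the XML-caching workaround Brice benchmarked below (the file path here is a placeholder, not something from this thread) might look like:

```shell
# Sketch of the XML-caching workaround (path is a placeholder): discover
# the topology once per node, then have every rank load the cached XML
# instead of re-probing /proc and /sys.
lstopo /tmp/topo.xml                 # run once per node: export topology to XML
export HWLOC_XMLFILE=/tmp/topo.xml   # hwloc consumers now read the XML
export HWLOC_THISSYSTEM=1            # assert the XML really describes this node
```

The same thing can be done programmatically with hwloc_topology_set_xml() before hwloc_topology_load(), if the environment-variable route isn't an option.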

S.

-----Original Message-----
From: Jeff Hammond [jhammond@alcf.anl.gov]
Sent: Tuesday, March 05, 2013 07:17 PM Mountain Standard Time
To: Hardware locality user list
Subject: [EXTERNAL] Re: [hwloc-users] Many queries creating slow performance

Si - Is your code that calls hwloc part of MPICH or OpenMPI or
something that can be made standalone and shared?

Brice - Do you have access to a MIC system for testing?  Write me
offline if you don't and I'll see what I can do to help.

If this affects MPICH, i.e. Hydra, then I'm sure Intel will be
committed to helping fix it, since Intel MPI is using Hydra as the
launcher on systems like Stampede.

Best,

Jeff

On Tue, Mar 5, 2013 at 3:05 PM, Brice Goglin <Brice.Goglin@inria.fr> wrote:
> Just tested on a 96-core shared-memory machine. Running OpenMPI 1.6 mpiexec
> lstopo, here's the execution time (mpiexec launch time is 0.2-0.4s)
>
> 1 rank :  0.2s
> 8 ranks:  0.3-0.5s depending on binding (packed or scatter)
> 24 ranks: 0.8-3.7s depending on binding
> 48 ranks: 2.8-8.0s depending on binding
> 96 ranks: 14.2s
>
> 96 ranks from a single XML file: 0.4s (negligible against mpiexec launch
> time)
>
> Brice
>
>
>
> Le 05/03/2013 20:23, Simon Hammond a écrit :
>
> Hi HWLOC users,
>
> We are seeing some significant performance problems using HWLOC 1.6.2 on
> Intel's MIC products. In one of our configurations we create 56 MPI ranks;
> each rank then queries the topology of the MIC card before creating threads.
> We are noticing that if we run 56 MPI ranks as opposed to one, the calls to
> query the topology in HWLOC are very slow; runtime goes from seconds to
> minutes (and upwards).
>
> We guessed that this might be caused by the kernel serializing access to the
> /proc filesystem, but this is just a hunch.
>
> Has anyone had this problem and found an easy way to change the library or
> its calls to HWLOC so that the slowdown is not experienced? Would you
> describe this as a bug?
>
> Thanks for your help.
>
>
> --
> Simon Hammond
>
> 1-(505)-845-7897 / MS-1319
> Scalable Computer Architectures
> Sandia National Laboratories, NM
>
>
>
>
>
>
> _______________________________________________
> hwloc-users mailing list
> hwloc-users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users



--
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond@alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond
