Hardware Locality Users' Mailing List Archives

Subject: Re: [hwloc-users] Many queries creating slow performance
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2013-03-05 15:28:13


FWIW, we do this in Open MPI: one process on each server does the topology discovery (the equivalent of lstopo, via the C API, of course). That information is then exported to all other processes via XML, so that only 1 process per server walks the /sys trees, etc.

On Mar 5, 2013, at 3:25 PM, Brice Goglin <Brice.Goglin_at_[hidden]> wrote:

> Hello Simon,
>
> I don't think anybody ever benchmarked this, but people have complained about this problem appearing on large machines at some point. I have a large SGI machine at work; I'll see if I can reproduce this.
>
> One solution is to export the topology to XML once and then have all your MPI processes read from XML. Basically, do "lstopo /tmp/foo.xml" and then export HWLOC_XMLFILE=/tmp/foo.xml in the environment before starting your MPI job.
>
> If the topology doesn't change (and that's likely the case), the XML file could even be stored by the administrator in a "standard" location (not in /tmp).
>
> Brice
>
>
>
> On 05/03/2013 20:23, Simon Hammond wrote:
>> Hi HWLOC users,
>>
>> We are seeing some significant performance problems using HWLOC 1.6.2 on Intel's MIC products. In one of our configurations we create 56 MPI ranks, and each rank queries the topology of the MIC card before creating threads. We are noticing that if we run 56 MPI ranks instead of one, the calls to query the topology in HWLOC become very slow: runtime goes from seconds to minutes (and upwards).
>>
>> We guessed that this might be caused by the kernel serializing access to the /proc filesystem, but this is just a hunch.
>>
>> Has anyone had this problem and found an easy way to change the library or the calls to HWLOC so that the slowdown goes away? Would you describe this as a bug?
>>
>> Thanks for your help.
>>
>>
>> --
>> Simon Hammond
>>
>> 1-(505)-845-7897 / MS-1319
>> Scalable Computer Architectures
>> Sandia National Laboratories, NM
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> hwloc-users mailing list
>>
>> hwloc-users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/