
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see processes as bound if the job has been launched by srun
From: Ralph Castain (rhc_at_[hidden])
Date: 2012-02-09 16:14:05


Hmmm... guess we'll have to play with it. Our need is to start with a core or some similar object, and quickly determine the closest I/O device of a certain type. We wound up having to write "summarizer" code to parse the hwloc tree into a more OMPI-usable form, so we can always do that with the I/O tree as well if necessary.
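
A minimal sketch of that lookup, assuming a simplified stand-in for the topology tree. The types and the names `topo_obj`, `non_io_ancestor`, and `closest_io_device` are hypothetical and are not the real hwloc structures or API. It follows the walk described in the quoted reply below: find a device's locality by following parent links up to the first non-I/O object, then prefer the device whose locality is the deepest ancestor of the starting core.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified topology object. NOT the real hwloc type. */
enum obj_kind { KIND_MACHINE, KIND_NUMA, KIND_CORE, KIND_BRIDGE, KIND_PCI_DEV };

struct topo_obj {
    enum obj_kind kind;
    int is_io;                /* 1 for bridges and PCI devices (no cpuset) */
    const char *name;
    struct topo_obj *parent;  /* NULL at the root */
};

/* Walk parent links until the first non-I/O object; that object is the
 * device's locality. This is only the idea behind
 * hwloc_get_non_io_ancestor_obj(), not its implementation. */
struct topo_obj *non_io_ancestor(struct topo_obj *obj)
{
    while (obj != NULL && obj->is_io)
        obj = obj->parent;
    return obj;
}

/* Return 1 if anc is obj itself or one of obj's ancestors, else 0. */
int is_ancestor_or_self(const struct topo_obj *anc, const struct topo_obj *obj)
{
    for (; obj != NULL; obj = obj->parent)
        if (obj == anc)
            return 1;
    return 0;
}

/* Number of parent hops from obj up to the root. */
int depth_of(const struct topo_obj *obj)
{
    int d = 0;
    for (; obj->parent != NULL; obj = obj->parent)
        d++;
    return d;
}

/* Among devs, return the device of the requested kind whose locality is
 * the deepest ancestor of core, or NULL if no device qualifies. Ties keep
 * the first device found. */
struct topo_obj *closest_io_device(const struct topo_obj *core,
                                   struct topo_obj **devs, size_t ndevs,
                                   enum obj_kind kind)
{
    struct topo_obj *best = NULL;
    int best_depth = -1;

    assert(core != NULL && !core->is_io);
    for (size_t i = 0; i < ndevs; i++) {
        if (devs[i]->kind != kind)
            continue;
        const struct topo_obj *loc = non_io_ancestor(devs[i]);
        if (loc == NULL || !is_ancestor_or_self(loc, core))
            continue;
        int d = depth_of(loc);
        if (d > best_depth) {
            best_depth = d;
            best = devs[i];
        }
    }
    return best;
}
```

With two NUMA nodes that each have a bridge and a NIC below them, a core under the first NUMA node resolves to the first NIC. A device whose bridge hangs off the Machine object is chosen only when nothing closer exists. A real "summarizer" would presumably build the device list once from the hwloc tree rather than take it as an argument.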

On Feb 9, 2012, at 2:09 PM, Brice Goglin wrote:

> That doesn't really work with the hwloc model unfortunately. Also, when you get to smaller objects (cores, threads, ...) there are multiple "closest" objects at each depth.
>
> We have one "closest" object at some depth (usually Machine or NUMA node). If you need something higher, you just walk the parent links. If you need something smaller, you look at children.
>
> Also, each I/O device isn't directly attached to such a closest object. It's usually attached under some bridge objects. There's a tree of hwloc PCI bus objects exactly like you have a tree of hwloc sockets/cores/threads/etc. At the top of the I/O tree, one (bridge) object is attached to a regular object as explained earlier. So, when you have a random hwloc PCI object, you get its locality by walking up its parent link until you find a non-I/O object (one whose cpuset isn't NULL). hwloc/helper.h gives you hwloc_get_non_io_ancestor_obj() to do that.
>
> Brice
>
>
>
> On 09/02/2012 14:34, Ralph Castain wrote:
>>
>> Ah, okay - in that case, having the I/O device attached to the "closest" object at each depth would be ideal from an OMPI perspective.
>>
>> On Feb 9, 2012, at 6:30 AM, Brice Goglin wrote:
>>
>>> The BIOS usually tells you which NUMA location is close to each host-to-PCI bridge. So the answer is yes.
>>> Brice
>>>
>>>
>>> Ralph Castain <rhc_at_[hidden]> wrote:
>>> I'm not sure I understand this comment. A PCI device is attached to the node, not to any specific location within the node, isn't it? Can you really say that a PCI device is "attached" to a specific NUMA location, for example?
>>>
>>>
>>> On Feb 9, 2012, at 6:15 AM, Jeff Squyres wrote:
>>>
>>>> That doesn't seem too attractive from an OMPI perspective, though. We'd want to know where the PCI devices are actually rooted.
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>