Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see processes as bound if the job has been launched by srun
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2012-02-09 16:09:20


Unfortunately, that doesn't really fit the hwloc model. Also, once you
get down to smaller objects (cores, threads, ...), there are multiple
"closest" objects at each depth.

We have one "closest" object at some depth (usually the Machine or a
NUMA node). If you need something higher, just walk up the parent links.
If you need something smaller, look at its children.
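
As a rough illustration of that navigation, here is a minimal sketch on a
toy tree. The toy_node struct and the toy_* functions are hypothetical
stand-ins, not hwloc types or calls; they only mirror the idea of one
"closest" object with parent links above it and children below it.

```c
#include <stddef.h>

/* Hypothetical object types, loosely mirroring Machine / NUMA node / core. */
enum toy_type { TOY_MACHINE, TOY_NUMANODE, TOY_CORE };

/* Hypothetical stand-in for a topology object (not hwloc_obj_t). */
struct toy_node {
    enum toy_type type;
    struct toy_node *parent;     /* NULL at the root */
    struct toy_node **children;  /* NULL for leaves */
    size_t arity;                /* number of children */
};

/* "Something higher": return obj itself or its first ancestor of the given type. */
static struct toy_node *toy_ancestor_of_type(struct toy_node *obj, enum toy_type type)
{
    while (obj != NULL && obj->type != type)
        obj = obj->parent;
    return obj;
}

/* "Something smaller": count the objects of the given type below obj. */
static size_t toy_count_below(const struct toy_node *obj, enum toy_type type)
{
    size_t n = 0;
    for (size_t i = 0; i < obj->arity; i++) {
        const struct toy_node *child = obj->children[i];
        n += (child->type == type) ? 1 : toy_count_below(child, type);
    }
    return n;
}

/* Build Machine -> NUMA node -> 2 cores, treating the NUMA node as the
 * "closest" object. Returns the number of cores below it, or -1 if the
 * walk up the parent links does not reach the Machine. */
static int toy_demo(void)
{
    struct toy_node machine = {TOY_MACHINE, NULL, NULL, 0};
    struct toy_node numa = {TOY_NUMANODE, &machine, NULL, 0};
    struct toy_node core0 = {TOY_CORE, &numa, NULL, 0};
    struct toy_node core1 = {TOY_CORE, &numa, NULL, 0};
    struct toy_node *numa_children[] = {&core0, &core1};
    struct toy_node *machine_children[] = {&numa};

    numa.children = numa_children;
    numa.arity = 2;
    machine.children = machine_children;
    machine.arity = 1;

    if (toy_ancestor_of_type(&numa, TOY_MACHINE) != &machine)
        return -1;
    return (int)toy_count_below(&numa, TOY_CORE);
}
```

Both cores under the NUMA node are equally "close", which is why there is
no single closest object once you go below that depth.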

Also, an I/O device usually isn't attached directly to such a closest
object; it sits under some bridge objects. There's a tree of hwloc PCI
bus objects, exactly like there is a tree of hwloc
sockets/cores/threads/etc. At the top of the I/O tree, one (bridge)
object is attached to a regular object as explained earlier. So, given
an arbitrary hwloc PCI object, you get its locality by walking up its
parent links until you find a non-I/O object (one whose cpuset isn't
NULL). hwloc/helper.h gives you hwloc_get_non_io_ancestor_obj() to do that.
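
The walk described above can be sketched on a toy tree as follows. This
is only an illustration under stated assumptions: struct toy_obj and the
toy_* functions are hypothetical and model just the fields needed here.
With a real topology you would call
hwloc_get_non_io_ancestor_obj(topology, ioobj) rather than write the loop
yourself.

```c
#include <stddef.h>

/* Hypothetical stand-in for hwloc_obj_t with only the fields used below. */
struct toy_obj {
    const char *name;             /* label, for illustration only */
    const unsigned long *cpuset;  /* NULL for I/O objects (bridges, PCI devices) */
    struct toy_obj *parent;       /* NULL at the root */
};

/* Walk up the parent links until an object with a non-NULL cpuset is found.
 * Returns NULL if no such object exists. */
static struct toy_obj *toy_non_io_ancestor(struct toy_obj *obj)
{
    while (obj != NULL && obj->cpuset == NULL)
        obj = obj->parent;
    return obj;
}

/* Build Machine -> NUMANode -> host bridge -> PCI bridge -> PCI device
 * and return the name of the object that gives the device its locality. */
static const char *toy_demo_locality(void)
{
    static const unsigned long machine_cpus = 0xffUL; /* made-up CPU masks */
    static const unsigned long numa_cpus = 0x0fUL;

    struct toy_obj machine = {"Machine", &machine_cpus, NULL};
    struct toy_obj numa = {"NUMANode", &numa_cpus, &machine};
    struct toy_obj hostbridge = {"HostBridge", NULL, &numa};
    struct toy_obj pcibridge = {"PCIBridge", NULL, &hostbridge};
    struct toy_obj pcidev = {"PCIDevice", NULL, &pcibridge};

    struct toy_obj *locality = toy_non_io_ancestor(&pcidev);
    return locality != NULL ? locality->name : "none";
}
```

In this toy layout the device's locality is the NUMA node, because that
is the first ancestor above the bridges with a non-NULL cpuset.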

Brice

On 09/02/2012 14:34, Ralph Castain wrote:
> Ah, okay - in that case, having the I/O device attached to the
> "closest" object at each depth would be ideal from an OMPI perspective.
>
> On Feb 9, 2012, at 6:30 AM, Brice Goglin wrote:
>
>> The BIOS usually tells you which NUMA location is close to each
>> host-to-PCI bridge. So the answer is yes.
>> Brice
>>
>>
>> Ralph Castain <rhc_at_[hidden]> wrote:
>>
>> I'm not sure I understand this comment. A PCI device is attached
>> to the node, not to any specific location within the node, isn't
>> it? Can you really say that a PCI device is "attached" to a
>> specific NUMA location, for example?
>>
>>
>> On Feb 9, 2012, at 6:15 AM, Jeff Squyres wrote:
>>
>>> That doesn't seem too attractive from an OMPI perspective,
>>> though. We'd want to know where the PCI devices are actually
>>> rooted.
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>