Hardware Locality Development Mailing List Archives

Subject: Re: [hwloc-devel] towards PLPA-like API in 1.0
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-12-10 16:57:56


On Dec 10, 2009, at 2:08 AM, Brice Goglin wrote:

> >> 1) get_obj_under_by_type(topology, type, index, subtype, subindex)
> >> returns for instance core 2 under socket 3. It's very easy
> >> (get_obj_by_type+get_obj_inside_cpuset_by_type).
> >> 2) Some people might want _under_under with 3 types/indexes. Not sure we
> >> want it, or want to make it generic with arrays of types/indexes...
> >> 3) Generic conversion routines between os_index and logical_index, like
> >> get_obj_by_os_index(type, os_index) and get_os_index_by_type(type, index)
> >> 4) Some kind of processor flag which tells us whether a physical proc
> >> exists and is online
> >
> > Any opinion about this? Should we drop the current plpa.h and just add
> > the above new inlines to helper.h? (with some documentation about
> > switching from PLPA into these new functions)
>
> Since nobody commented

Sorry. :-( Yes, I agree: removing plpa.h is good :-) and so are the new helper functions.

> and Jeff has already removed the PLPA tests from
> trunk, I am going to add (1) and probably (3), and document (2) and (4)
> in the PLPA doc section. Then I'll move most comments from plpa.h into
> this doc section and remove plpa.h entirely.

How about a v-like interface for the generic case you mentioned in #2? (Analogous to writev(), etc.: it takes arrays of types/indexes instead of a fixed number of arguments.)
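To make the array idea concrete, here is a rough sketch on a toy tree. Everything in it (toy_obj, toy_type, toy_get_obj_under_v) is made up for illustration and is not hwloc API; it only shows how one call taking arrays could cover _under, _under_under, and deeper lookups.

```c
#include <stddef.h>

/* Hypothetical toy types, NOT the hwloc object model. */
enum toy_type { TOY_MACHINE, TOY_SOCKET, TOY_CORE, TOY_PU };

struct toy_obj {
    enum toy_type type;
    size_t nchildren;
    struct toy_obj **children;
};

/* Depth-first search for the (*remaining)-th descendant of the given type.
 * Assumes objects of one type never nest inside each other. */
static struct toy_obj *nth_descendant(struct toy_obj *parent,
                                      enum toy_type type,
                                      unsigned *remaining)
{
    for (size_t i = 0; i < parent->nchildren; i++) {
        struct toy_obj *child = parent->children[i];
        if (child->type == type) {
            if (*remaining == 0)
                return child;
            (*remaining)--;
            continue; /* same type does not nest, so skip this subtree */
        }
        struct toy_obj *hit = nth_descendant(child, type, remaining);
        if (hit)
            return hit;
    }
    return NULL;
}

/* Walk down n levels: at step i, pick the indexes[i]-th object of
 * types[i] below the object chosen at step i-1. Returns NULL if any
 * step has no such object. */
static struct toy_obj *toy_get_obj_under_v(struct toy_obj *root,
                                           const enum toy_type *types,
                                           const unsigned *indexes,
                                           size_t n)
{
    struct toy_obj *cur = root;
    for (size_t i = 0; i < n && cur != NULL; i++) {
        unsigned remaining = indexes[i];
        cur = nth_descendant(cur, types[i], &remaining);
    }
    return cur;
}
```

With types = {TOY_SOCKET, TOY_CORE} and indexes = {3, 2}, this would return core 2 under socket 3, i.e. the same thing as (1), and a third array entry gives the _under_under case without a new function.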

Did we settle the whole OS/physical vs. logical numbering issue?

I think we decided that all CLI tools will report/accept logical numbering by default, but also accept --physical to switch to OS/physical numbering. Are you saying that the API will be all OS/physical, with conversion functions from #3 to convert to/from logical? Seems a little weird that the default would be opposite between the CLI and the API...? (I could be misunderstanding you...)

Additionally, what exactly is the logical ordering defined to be? We need to guarantee that it is the same across every run, and across reboots. I.e., I see that topology-linux.c uses opendir() and readdir() to read entries from /sys. Do we sort the data somehow before putting it into data structures (so that the logical ordering is the same every time), or is the order defined by readdir()? If it's defined by readdir(), then the order is unspecified and effectively arbitrary each time. Although the order is *unlikely to change*, it still *could*. I think we can't do anything if the OS decides to change its ID for a given device, but we should be able to have a stable logical ordering even if readdir() returns a different order on successive runs.

My point: if we're going to have a logical ordering, we should be able to provide at least some level of guarantee of stability about that logical ordering.
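To illustrate what I mean, here is a minimal sketch (not what topology-linux.c actually does; the function names and the "cpuN" entry naming are assumptions for the example). It collects the OS indexes from a directory, sorts them, and treats the position in the sorted array as the logical index, so the result no longer depends on the order readdir() happens to return.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

/* Compare two OS indexes for qsort()/bsearch(); no subtraction, so no
 * overflow. */
static int cmp_os_index(const void *a, const void *b)
{
    unsigned x = *(const unsigned *)a;
    unsigned y = *(const unsigned *)b;
    return (x > y) - (x < y);
}

/* Sort OS indexes in place: afterwards, position i is logical index i. */
static void assign_logical_order(unsigned *os_index, size_t n)
{
    qsort(os_index, n, sizeof(*os_index), cmp_os_index);
}

/* Logical index of a given OS index, or -1 if it is not present.
 * The array must already be sorted by assign_logical_order(). */
static int logical_from_os(const unsigned *os_index, size_t n, unsigned os)
{
    const unsigned *hit =
        bsearch(&os, os_index, n, sizeof(*os_index), cmp_os_index);
    return hit ? (int)(hit - os_index) : -1;
}

/* Collect N from every "cpuN" entry in dir (hypothetical layout), then
 * sort. Returns the number of entries found, or -1 if dir cannot be
 * opened. */
static int read_cpus_sorted(const char *dir, unsigned *os_index, size_t max)
{
    DIR *d = opendir(dir);
    struct dirent *e;
    size_t n = 0;

    if (d == NULL)
        return -1;
    while (n < max && (e = readdir(d)) != NULL) {
        unsigned idx;
        char extra;
        /* The trailing %c rejects names with junk after the number;
         * names such as "cpufreq" fail at %u and are skipped too. */
        if (sscanf(e->d_name, "cpu%u%c", &idx, &extra) == 1)
            os_index[n++] = idx;
    }
    closedir(d);
    assign_logical_order(os_index, n);
    return (int)n;
}
```

The logical-to-OS direction is then just os_index[logical], and logical_from_os() covers the other direction, which is roughly the pair of conversions from #3. Sorting by OS index is only one possible definition of "logical"; the point is that whatever key we pick, it must not be readdir() order.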

Make sense?

-- 
Jeff Squyres
jsquyres_at_[hidden]