PLPA Users' Mailing List Archives

From: Bert Wesarg (wesarg_at_[hidden])
Date: 2007-04-24 03:51:53


Jeff Squyres wrote:
> On Apr 22, 2007, at 5:47 PM, Bert Wesarg wrote:
>
>>> I'm not sure what you mean -- can you give a concrete example?
>> I think that was my thinking error. The current one-to-one mapping
>> between the processor ID and the (socket,core) tuple is perfectly
>> fine.
>> What I mean with the NUMA node ID is to add the node ID as an
>> attribute of the processor ID, which can be queried through the API.
>> But to support SMT, it would be necessary to extend it to a
>> (socket,core,thread) tuple.
>
> I still don't understand this last statement.
OK, if the current (socket,core) PLPA code ever runs on a machine with
SMT/HTT (or whatever it's called today) enabled, it will find the same
(socket,core) tuple for multiple processor IDs, breaking the one-to-one
mapping, because a core has several threads and therefore several virtual
processor IDs. But I can't prove it, because I don't have access to such a
machine, unless someone sends a CPU topology with HTT/SMT enabled (see
the Call for Help message).
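
To make the concern concrete, here is a minimal sketch of a check for this
situation. It is not PLPA code: the struct, the function name and the idea
of filling the table from the per-CPU topology files under /sys are my
assumptions for illustration only. It counts how many logical processors
share a (socket,core) tuple with an earlier entry; any non-zero result
means the mapping is no longer one-to-one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record, one per logical processor. The socket and core
 * values are assumed to come from the per-CPU topology information. */
struct cpu_entry {
    int processor_id;
    int socket;
    int core;
};

/* Count the entries whose (socket,core) tuple already appeared earlier
 * in the table. Zero means (socket,core) <-> processor ID is one-to-one;
 * anything else means several processor IDs share a core (e.g. SMT). */
static size_t count_shared_cores(const struct cpu_entry *cpus, size_t n)
{
    size_t shared = 0;

    assert(cpus != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        for (size_t j = 0; j < i; j++) {
            if (cpus[i].socket == cpus[j].socket &&
                cpus[i].core == cpus[j].core) {
                shared++;
                break; /* count each duplicate entry only once */
            }
        }
    }
    return shared;
}
```

With an invented table of four processor IDs on one dual-core socket with
two threads per core, this would report two shared entries. That table is
made up; I still have no real topology from such a machine.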

>
>> I have implemented the node attribute and changed/extended the API as
>> follows:
>>
>> /* node_id can be NULL */
>> int PLPA_NAME(map_to_processor_id)
>> (int socket, int core, int *processor_id, int *node_id);
>
> Taking a step back, it seems like given any one of the three
> following entities (bear with me -- I'm thinking out loud):
>
> - (socket,core) tuple
> - processor ID
> - node ID
>
> You want to be able to query for the other two. It seems like we
> should be able to have a very small number of functions to be able to
> query for all of these. But it's unfortunately a little more complex
> than this because:
>
> - one-to-one: (socket,core) -> processor ID
beyond question

> - one-to-one: (socket,core) -> node ID
IMHO redundant and not necessary

> - one-to-one: processor ID -> (socket,core)
beyond question

> - one-to-one: processor ID -> node ID
beyond question

> - one-to-many: node ID -> (socket,core)
interesting, but

> - one-to-many: node ID -> processor ID
I find this better

The above statements result from the following point of view: the node ID
is only an attribute of the processor ID. The main advantage of this point
of view is that it makes the API very clear and simple. There is only one
main mapping: (socket,core) <-> processor ID. The node ID is just useful
extra information.
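
As a rough sketch of this point of view (not the actual PLPA
implementation), the node ID can simply be one more field stored next to
each processor ID. The table contents, the example_ prefix and the 0 / -1
return convention below are assumptions made for illustration; only the
argument lists follow the functions I proposed.

```c
#include <stddef.h>

/* Hypothetical table entry: the node ID is stored as an attribute of the
 * processor ID, next to the (socket,core) tuple. */
struct proc_info {
    int processor_id;
    int socket;
    int core;
    int node_id;
};

/* Invented example topology: 2 sockets, 2 cores each, 1 node per socket. */
static const struct proc_info example_table[] = {
    { 0, 0, 0, 0 },
    { 1, 0, 1, 0 },
    { 2, 1, 0, 1 },
    { 3, 1, 1, 1 },
};
#define EXAMPLE_TABLE_LEN (sizeof(example_table) / sizeof(example_table[0]))

/* Main mapping: (socket,core) -> processor ID. node_id can be NULL.
 * Returns 0 on success, -1 if the tuple is unknown (assumed convention). */
static int example_map_to_processor_id(int socket, int core,
                                       int *processor_id, int *node_id)
{
    if (processor_id == NULL)
        return -1;
    for (size_t i = 0; i < EXAMPLE_TABLE_LEN; i++) {
        if (example_table[i].socket == socket &&
            example_table[i].core == core) {
            *processor_id = example_table[i].processor_id;
            if (node_id != NULL)
                *node_id = example_table[i].node_id;
            return 0;
        }
    }
    return -1;
}

/* Attribute query: processor ID -> node ID. */
static int example_map_to_node_id(int processor_id, int *node_id)
{
    if (node_id == NULL)
        return -1;
    for (size_t i = 0; i < EXAMPLE_TABLE_LEN; i++) {
        if (example_table[i].processor_id == processor_id) {
            *node_id = example_table[i].node_id;
            return 0;
        }
    }
    return -1;
}
```

A separate (socket,core) -> node ID function is not needed with this
layout, since the node ID comes along with the processor ID lookup.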

>
> I see two possibilities:
>
> 1. Have 1 function and define a struct for input and output,
> something like this: ...never mind, I wrote it out and it was really
> ugly.
>
> 2. Have 3 functions (assuming we delete the 2 current plpa_map_to_*()
> functions):
>
> - have (socket,core), want processor ID and/or node ID
> plpa_map_from_socket_core(int socket, int core, int *processor_id,
> int *node_id);
> processor_id can be NULL or node_id can be NULL, but not both.
>
> - have processor ID, want (socket,core) and/or node ID
> plpa_map_from_processor(int processor_id, int *socket, int *core,
> int *node_id);
> (socket,core) can be NULL or node_id can be NULL, but not both.
>
> - have node ID, want a list of processor IDs and/or a list of
> (socket,core) tuples
> plpa_map_from_node(int node_id, int *socket, int *core, int
> *processor_id);
> (socket,core) can be NULL or processor_id can be NULL, but not both.
>
> How does this sound?
>
>> /* this name is not perfect, node_id is an attribute of processor_id */
>> int PLPA_NAME(map_to_node_id)(int processor_id, int *node_id);
>>
>> int PLPA_NAME(max_node_id)(int *max_node_id);
>>
>> The one thing that is maybe missing is a function to query the
>> number of sockets in a NUMA node.
>
> If you create a "non trivial" patch, I'll need to have an OMPI 3rd
> party contribution agreement from you. It's unfortunately necessary
> in today's legal software climate. :-(
Convinced. Is it necessary to send this mail as certified/registered
mail or with a return receipt?

>
> But if you can describe to me the stuff I need to parse in /sys to
> get node mappings, I'm happy to add this functionality myself.
>

In devices/system/node/ there are directories named "node%d". These
directories contain symbolic links to the "cpu%d" directories in
devices/system/cpu/. Alternatively, use the cpumap file in these
directories; it is a cpumask of the processor IDs in this node.
That's all.
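
For the cpumap variant, a minimal parsing sketch might look like the
following. It assumes the usual sysfs cpumask text format: hexadecimal
digits in comma-separated groups of 32 bits, most significant group first,
possibly followed by a newline. The function name and the return
convention are made up here; reading the file itself is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Parse a cpumask string such as "00000000,0000000f" and store the
 * processor IDs of the set bits in ids[] in ascending order (at most
 * max_ids of them). Returns the total number of set bits, or -1 if the
 * string contains an unexpected character. */
static int parse_cpumap(const char *mask, int *ids, int max_ids)
{
    int count = 0;
    int bit = 0; /* processor ID of the lowest bit of the current digit */
    size_t len;

    assert(mask != NULL && ids != NULL);
    len = strlen(mask);

    /* Ignore a trailing newline or other trailing whitespace. */
    while (len > 0 && isspace((unsigned char)mask[len - 1]))
        len--;

    /* Walk backwards, starting at the least significant hex digit. */
    for (size_t i = len; i-- > 0; ) {
        char c = mask[i];
        int value;

        if (c == ',')
            continue; /* group separator carries no bits */
        if (c >= '0' && c <= '9')
            value = c - '0';
        else if (c >= 'a' && c <= 'f')
            value = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F')
            value = c - 'A' + 10;
        else
            return -1;

        for (int b = 0; b < 4; b++) {
            if (value & (1 << b)) {
                if (count < max_ids)
                    ids[count] = bit + b;
                count++;
            }
        }
        bit += 4;
    }
    return count;
}
```

The node%d directory name gives the node ID, and the IDs returned by such
a function would be the processor IDs in that node, which is exactly the
one-to-many node ID -> processor ID mapping discussed above.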

Bert