Open MPI logo

Hardware Locality Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Hardware Locality Development mailing list

Subject: Re: [hwloc-devel] hwloc powerpc rhel5 and power7 patch
From: Alexey Kardashevskiy (aik_at_[hidden])
Date: 2010-09-17 06:01:46

On 17/09/10 00:57, Samuel Thibault wrote:
> Alexey Kardashevskiy, le Thu 16 Sep 2010 15:57:47 +1000, a écrit :
>>>> - where do I put IBM-specific code?
>>> Is the device tree linux-specific ? If so, it can stay in linux file as
>>> long as it's not 30k lines :) We already have both sysfs and
>>> /proc/cpuinfo code there anyway.
>> It is powerpc-specific. It is mapped from the system firmware (aka bios)
>> by the powerpc kernel. However it is just a folder within /proc so it is
>> usual linux folder. But PowerPC might be not the only architecture which
>> uses the same pathname for the same thing.
> AIUI from google searches, it is an Open Firmware standard, also
> used on the OLPC, and thus would be not only for PowerPC. Since
> that's not really Linux-specific, it would take place somewhere
> else than topology-linux.c, but the standard doesn't seem to say
> how/where this tree is exposed. Apparently there's even some sort
> of text format for it, see arch/powerpc/boot/dts/*.dts. I'd thus
> say that all OS-independant code should go to a topology-of.c file,
> and topology-linux.c would provide generic tree functions to browse
> /proc/device-tree, i.e. of_get_str, of_get_int_arr, and generic tree
> browsing functions, which topology-of.c can then use to browse the
> tree and fetch information without knowing where it is taken from. A
> topology-dts.c file would provide the same functions, but this time
> providing data from a .dts file, and that combination could be a backend
> replacing the /proc and /sys parsing completely, similarly to loading a
> .xml file. Note that I'm not asking you to do all this, I'm just asking
> to rework the function interfaces a little bit to have things already
> cleanly separated for anybody who would feel like adding another OS
> support or parsing .dts files some day, I believe that shouldn't be too
> much work for you.

Regarding topology walking - there is actually nothing device-tree
special in reading strings and numbers from a device-tree, it is just
common functions which (I think) should be placed in utils/misc.c. I
named functions like hwloc_read_str.

Regarding open firmware and device trees - yes, it is IEEE1275 which can
be implemented anywhere. However, in our case, there are IBM-specific
properties (ibm,phandle, ibm,ppc-interrupt-server#s) which make all this
very IBM specific. I do not know how to deal with it better.

I reworked the patch today, it is attached. It is made against today's
SVN tree.

> > From what I can read in drivers/of/fdt.c, the binary values are passed
> directly from the table in memory. I wonder which endian order this is
> supposed to be, but since I can see be32_to_cpu all around in the code
> there, I guess it is assumed to be all big endian. There are thus also a
> few fixes to be done in your code:
> - Make sure you have proper sizes: code like
> unsigned int reg = -1;
> if (0 != of_get_int_arr(cpu, "reg",&reg, sizeof(reg), root_fd))
> is not really safe, you should use uint32_t to make sure of the size of
> the integer.
> - Once the read is done, values need to be swapped on little-endian
> machines. You could use ntohl for that.

The hwlow code is compiled on powerpc and reads device-tree on a powerpc
machine. If endianess does not match in this case, we have got a big
problem, this should not happen. Or I am missing something.
The point regarding uint32_t vs. (unsigned int) is taken.

> - Constant-sized buffers are never a good idea. I know there are already
> some in topology-linux, but that's not a reason when it should be easy
> to make things properly :)

Fixed :)

> It's thus say that you should rather provide the following functions:
> static inline int
> of_get_int(const char *p, const char *p1, uint32_t *value, int root_fd);
> Gets one integer into `value'.

done :)

> static inline uint32_t *
> of_get_int_arr(const char *p, const char *p1, int root_fd);
> Returns the array of integers. Yes, this makes a lot of
> allocation/deallocation, but that should be neglectible compared to the
> system calls and will save us nasty length-bugs later.

How many integers were read by this function? What allocator was used
for the returned value? Sure I can presume it was malloc, but still it
is not a good style :)

Regarding the latest patch about numa nodes numbers - do I understand
correctly that hwloc_cpuset_t is just a bit array which can be used
everywhere where we need a bit array, not just for CPUs? :-)