Good catch; fixed.
On Apr 23, 2012, at 4:40 PM, George Bosilca wrote:
> No strong opinion. However, the comment about the initial value of opal_cache_line_size is wrong (opal/runtime/opal.h), as it states that the default value is -1 while it is 128.
> On Apr 23, 2012, at 16:21, Jeffrey Squyres wrote:
>> No one replied to this RFC. Does anyone have an opinion about it?
>> I have attached a patch (including some debugging output) showing my initial implementation. If no one objects by the end of this week, I'll commit to the trunk.
>> Terry: please add this to the agenda tomorrow.
>> On Mar 30, 2012, at 1:09 PM, Jeffrey Squyres wrote:
>>> I was just recently reminded of a comment that is near the top of opal_init_util():
>>> /* JMS See note in runtime/opal.h -- this is temporary; to be
>>> replaced with real hwloc information soon (in trunk/v1.5 and
>>> beyond, only). This *used* to be a #define, so it's important
>>> to define it very early. */
>>> opal_cache_line_size = 128;
>>> A few points:
>>> 1. On my platforms, hwloc tells me that my cache line size is 64, not 128. Probably not a tragedy, but...
>>> 2. I see opal_cache_line_size being used in a lot of BTL and PML initialization locations. I see it being used in opal/class/free_list.*, too.
>>> 3. I poked around with this yesterday to see if we could have hwloc initialize the opal_cache_line_size value. Points to remember:
>>> - we initialize the opal hwloc framework in opal_init(), but we do not load the local machine's architecture then (because it can be expensive, particularly if lots of MPI processes are all doing it simultaneously)
>>> - instead, the local machine topology is discovered once by each orted (using hwloc) and then RML sent to each local MPI process, where it is locally loaded into each MPI proc's hwloc tree
>>> - this happens during the orte_init() in ompi_mpi_init()
>>> Meaning: we can initialize the opal_cache_line_size in MPI processes during orte_init().
>>> Is this acceptable to everyone?
>>> If so, I can go ahead and code this up. I would probably leave the initial value hard-coded to 128 (just in case something uses it before orte_init()), and then later during orte_init(), reset it to the smallest L1 cache line size that hwloc finds on the machine.
>>> Jeff Squyres
>>> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>>> devel mailing list
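
For readers following the thread, the approach proposed in the original RFC (keep a hard-coded default of 128, then reset it once topology data arrives) might look roughly like the sketch below. This is an illustration only, not the attached patch: cache_info_t, update_cache_line_size() and the plain cache_line_size variable are hypothetical stand-ins, and the real code would read the line sizes from the hwloc tree instead of an array.

```c
#include <stddef.h>

/* Hypothetical stand-in for one cache reported by a topology library. */
typedef struct {
    unsigned depth;      /* cache level: 1 = L1, 2 = L2, ... */
    unsigned line_size;  /* line size in bytes; 0 if unknown */
} cache_info_t;

/* Conservative default, in case something reads the value early. */
#define DEFAULT_CACHE_LINE_SIZE 128

/* Stand-in for opal_cache_line_size (assumption: a plain int). */
static int cache_line_size = DEFAULT_CACHE_LINE_SIZE;

/* Reset cache_line_size to the smallest known L1 line size.
   If no L1 cache reports a line size, the current value is kept. */
static void update_cache_line_size(const cache_info_t *caches, size_t n)
{
    unsigned best = 0;  /* 0 means "nothing found yet" */

    for (size_t i = 0; i < n; ++i) {
        if (caches[i].depth != 1 || caches[i].line_size == 0) {
            continue;   /* skip non-L1 caches and unknown sizes */
        }
        if (best == 0 || caches[i].line_size < best) {
            best = caches[i].line_size;
        }
    }

    if (best != 0) {
        cache_line_size = (int) best;
    }
}
```

Calling update_cache_line_size() with an empty list, or one that has no usable L1 entries, leaves the default of 128 in place. That matches the "just in case something uses it before orte_init()" reasoning in the RFC.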