I was recently reminded of a comment near the top of opal_init_util():
/* JMS See note in runtime/opal.h -- this is temporary; to be
replaced with real hwloc information soon (in trunk/v1.5 and
beyond, only). This *used* to be a #define, so it's important
to define it very early. */
opal_cache_line_size = 128;
A few points:
1. On my platforms, hwloc tells me that my cache line size is 64, not 128. Probably not a tragedy, but...
2. I see opal_cache_line_size being used in a lot of BTL and PML initialization locations. I see it being used in opal/class/free_list.*, too.
3. I poked around with this yesterday to see if we could have hwloc initialize the opal_cache_line_size value. Points to remember:
- we initialize the opal hwloc framework in opal_init(), but we do not load the local machine's topology at that point (because it can be expensive, particularly if lots of MPI processes are all doing it simultaneously)
- instead, the local machine topology is discovered once by each orted (using hwloc) and then sent via RML to each local MPI process, where it is loaded into that MPI proc's hwloc tree
- this happens during the orte_init() in ompi_mpi_init()
Meaning: we can initialize the opal_cache_line_size in MPI processes during orte_init().
Is this acceptable to everyone?
If so, I can go ahead and code this up. I would probably leave the initial value hard-coded to 128 (just in case something uses it before orte_init()), and then later during orte_init(), reset it to the smallest L1 cache line size that hwloc finds on the machine.
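To make the reset step a bit more concrete, here is a rough sketch of just the selection logic. It is not actual Open MPI code: pick_cache_line_size() is a hypothetical helper, and it assumes the caller has already collected the L1 line sizes from the hwloc tree (hwloc reports the line size in each cache object's attr->cache.linesize, with 0 meaning "unknown"). If hwloc reports nothing usable, we keep the hard-coded default.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of Open MPI or hwloc).
 *
 * linesizes: L1 cache line sizes in bytes, one per L1 cache found in the
 *            hwloc tree; 0 means hwloc did not know the value.
 * count:     number of entries in linesizes.
 * fallback:  value to keep if no usable line size was reported
 *            (e.g., the hard-coded 128).
 *
 * Returns the smallest nonzero line size, or fallback if there is none.
 */
static unsigned pick_cache_line_size(const unsigned *linesizes,
                                     size_t count,
                                     unsigned fallback)
{
    unsigned smallest = 0;

    for (size_t i = 0; i < count; ++i) {
        if (linesizes[i] == 0) {
            continue; /* unknown; ignore this cache */
        }
        if (smallest == 0 || linesizes[i] < smallest) {
            smallest = linesizes[i];
        }
    }

    return (smallest != 0) ? smallest : fallback;
}
```

In orte_init() this would be used roughly as opal_cache_line_size = pick_cache_line_size(sizes, n, opal_cache_line_size); so that the early 128 default survives whenever the topology carries no line size information.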