To expand slightly on Patrick's last comment:
> Cache prefetching is slightly
> more efficient on local socket, so closer to reader may be a bit better.
Ideally one polls from cache, but if the line is evicted, the next poll
after the eviction pays a lower cost when the memory is near the
reader.
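
For concreteness, here is a minimal sketch of the kind of polling loop under
discussion. It is not Open MPI code: the names (struct mailbox, writer,
run_poll_demo) and the 64-byte cache line size are assumptions made for
illustration. It uses C11 atomics with acquire/release ordering rather than a
plain volatile flag, since volatile by itself only restrains the compiler and
gives no ordering guarantee between threads. NUMA page placement is not shown.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical mailbox: a ready flag and a payload sharing one cache line.
 * The 64-byte alignment is an assumed line size, not a portable constant. */
struct mailbox {
    _Alignas(64) atomic_int ready;
    int payload;
};

/* Writer side: fill in the payload, then publish it by setting the flag.
 * The store to the flag is what invalidates the reader's cached copy. */
static void *writer(void *arg)
{
    struct mailbox *mb = arg;
    mb->payload = 42;
    atomic_store_explicit(&mb->ready, 1, memory_order_release);
    return NULL;
}

/* Reader side: spin on the flag. While the line stays valid in the reader's
 * cache, each poll is a cache hit; only after the writer's store (or an
 * eviction) does a poll have to fetch the line again. */
int run_poll_demo(void)
{
    struct mailbox mb;
    pthread_t tid;
    int rc;

    atomic_init(&mb.ready, 0);
    mb.payload = 0;

    rc = pthread_create(&tid, NULL, writer, &mb);
    assert(rc == 0);

    while (atomic_load_explicit(&mb.ready, memory_order_acquire) == 0)
        ; /* poll until the writer publishes */

    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    return mb.payload;
}
```

Calling run_poll_demo() from a main() and building with -pthread should
return the payload the writer stored. Where the page backing the mailbox
lives only affects the cost of the polls that miss in cache.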
-Paul
Patrick Geoffray wrote:
> Richard Graham wrote:
>> Yes - it is polling volatile memory, so has to load from memory on
>> every read.
>
> Actually, it will poll in cache, and only load from memory when the
> cache coherency protocol invalidates the cache line. Volatile semantics
> only prevent compiler optimizations.
>
> It does not matter much where the pages are (closer to reader or
> receiver) on NUMA systems, as long as they are equally distributed among
> all sockets (i.e., the choice is consistent). Cache prefetching is slightly
> more efficient on local socket, so closer to reader may be a bit better.
>
> Patrick
--
Paul H. Hargrove PHHargrove_at_[hidden]
Future Technologies Group
HPC Research Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900