Jeff Squyres <jsquyres_at_[hidden]> wrote:
> I *believe* that this has to do with physical setup within the
> machine (i.e., the NIC/HCA bus is physically "closer" to some
> sockets), but I'm not much of a hardware guy to know that for sure.
> Someone with more specific knowledge should chime in here...

On NUMA architectures, the most common being Opteron, the South Bridge is
connected through an HT link to one CPU on one socket. Which socket
depends on the motherboard, but it should be described in the
motherboard documentation (it's not always socket 0). If a process on
the other socket needs to write something to a NIC on a PCIE bus behind
the South Bridge, it needs to first hop through the first socket. This
hop usually costs something like 100 ns, i.e. 0.1 us. If the socket is
further away, like in a 4 or 8-socket configuration, there would
potentially be more hops.
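
To put rough numbers on that, here is a minimal sketch of the arithmetic in Python. The ~100 ns per hop is the ballpark figure quoted above; the function name and the hop counts passed to it are my own assumptions for illustration, since the real hop count depends on how the motherboard wires the HT links.

```python
def extra_nic_latency_us(ht_hops, per_hop_ns=100.0):
    """Estimate the added latency, in microseconds, for reaching a NIC
    that sits ht_hops HyperTransport hops away from the calling socket.

    per_hop_ns defaults to the ~100 ns ballpark figure; it is an
    estimate, not a measured value for any particular board.
    """
    if ht_hops < 0:
        raise ValueError("ht_hops must be non-negative")
    return ht_hops * per_hop_ns / 1000.0


# Hypothetical hop counts: 0 = socket attached to the South Bridge,
# 1 = the other socket on a 2-socket board, 2+ = larger configurations.
for hops in (0, 1, 2, 3):
    print(hops, "hop(s):", extra_nic_latency_us(hops), "us")
```

Measure on the actual machine before trusting any of these numbers.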

However, having the processes getting bumped from one socket to another
is more expensive in terms of cache locality (with all of the cache
coherency overhead that comes with the lack of it) than in terms of HT
hops.

Non-NUMA architectures like Intel Woodcrest have a flat access time to
the South Bridge, but cache locality is still important so CPU affinity
is always a good thing to do.
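
For what it's worth, here is a minimal sketch of pinning a process to a set of CPUs from Python. It assumes Linux, where `os.sched_getaffinity` and `os.sched_setaffinity` are available; the helper name `pin_to_cpus` is hypothetical, and which CPU ids belong to which socket is machine-specific, so check your own topology rather than trusting the ids used here.

```python
import os


def pin_to_cpus(cpus):
    """Restrict the calling process to the given CPU ids.

    Returns the resulting affinity set, or None on platforms that do
    not expose the Linux scheduler-affinity calls.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # not Linux: nothing to do in this sketch

    allowed = os.sched_getaffinity(0)   # CPUs we may currently run on
    target = set(cpus) & allowed        # never ask for a CPU we cannot use
    if not target:
        return allowed                  # leave the affinity unchanged

    os.sched_setaffinity(0, target)     # 0 means "the calling process"
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    # Assumption for illustration only: pin to the lowest-numbered CPU
    # we are allowed to use. It may or may not be on the socket nearest
    # the NIC on your board.
    if hasattr(os, "sched_getaffinity"):
        print(pin_to_cpus([min(os.sched_getaffinity(0))]))
```

Once pinned, the scheduler will not migrate the process to another socket, so its cache contents stay local.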