
Open MPI Development Mailing List Archives


Subject: [OMPI devel] Determining locality
From: Ralph Castain (rhc_at_[hidden])
Date: 2011-10-20 15:56:53


For those wishing to use the new locality functionality, here is some (hopefully clearer) information on how to do it. A few clarifications may help first:

1. The locality is defined by the precise cpu set to which a process is bound. If the process is not bound, its cpu set includes all the available cpus on the node where it resides.

2. The locality value we return to you is a bitmask in which each bit represents a specific hardware level shared between you (the proc in which the call to orte_ess.proc_get_locality is made) and the given process. In other words, if the "socket" bit is set, it means you and the process you specified are both bound to the same socket.

Important note: it does -not- mean that the other process is currently executing on the same socket as you are executing upon at this instant in time. It only means that the OS is allowing that process to use the same socket that you are allowed to use. As the process swaps in/out and moves around, it may or may not be co-located on the socket with you at any given instant.

We do not currently provide a way for a process to get the relative locality of two other remote processes. However, the infrastructure supports this, so we can add it if/when someone shows a use-case for it.

3. Every process has locality info for all of its peers AND for any proc that connected to it via MPI connect/accept or comm_spawn (the info is included in the modex during the connect/accept procedure). This is true regardless of launch method, with the exception of cnos (which doesn't have a modex).
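To make points 1 and 2 concrete, here is a toy model of how a locality bitmask could follow from two cpu sets. It is only a sketch under stated assumptions: the demo_ names, the bit values, and the 8-cpu/2-socket topology are all invented for illustration and are not the ORTE/OPAL definitions or the actual algorithm.

```c
#include <stdint.h>

/* Hypothetical locality bits for this sketch; NOT the real OPAL values. */
#define DEMO_LOC_NODE   0x1u
#define DEMO_LOC_SOCKET 0x2u

/* Toy topology (an assumption): one node, 8 cpus, two 4-cpu sockets.
 * Each cpu set is a bitmask with one bit per cpu. */
#define DEMO_NUM_SOCKETS 2
#define DEMO_ALL_CPUS    0xFFu  /* cpu set of an unbound process */
const uint8_t demo_socket_cpus[DEMO_NUM_SOCKETS] = { 0x0F, 0xF0 };

/* Locality of two procs on the same node, derived from their cpu sets.
 * The socket bit is set if both procs are allowed to run on at least
 * one common socket. It says nothing about where they run right now. */
unsigned demo_locality(uint8_t mine, uint8_t peer)
{
    unsigned loc = DEMO_LOC_NODE;
    for (int s = 0; s < DEMO_NUM_SOCKETS; s++) {
        if ((mine & demo_socket_cpus[s]) && (peer & demo_socket_cpus[s])) {
            loc |= DEMO_LOC_SOCKET;
            break;
        }
    }
    return loc;
}
```

In this model an unbound process (cpu set DEMO_ALL_CPUS) shares a socket with every other process on the node, which matches the statement in point 1 that an unbound process's cpu set covers all available cpus.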

With that in mind, let's start with determining if a proc is on the same node. The only way to determine if two procs other than yourself are on the same node is to compare their daemon vpids:

if (orte_ess.proc_get_daemon(A) == orte_ess.proc_get_daemon(B)), then A and B are on the same node.
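The daemon comparison can be sketched in isolation as follows. This is a mock, not ORTE code: demo_vpid_t, demo_get_daemon, and the lookup table are hypothetical stand-ins for the vpid type and orte_ess.proc_get_daemon, so the example compiles without the ORTE headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the vpid type (an assumption for this sketch). */
typedef uint32_t demo_vpid_t;

/* Hypothetical launch map: index = proc rank, value = vpid of the
 * daemon on the node hosting that proc. */
static const demo_vpid_t demo_daemon_of[] = { 1, 1, 2, 2, 3 };

/* Mock of the daemon lookup; the real call is orte_ess.proc_get_daemon. */
demo_vpid_t demo_get_daemon(int rank)
{
    return demo_daemon_of[rank];
}

/* Two procs are on the same node iff they share a daemon. */
bool demo_same_node(int a, int b)
{
    return demo_get_daemon(a) == demo_get_daemon(b);
}
```

With this table, ranks 0 and 1 report the same daemon and are therefore on the same node, while ranks 1 and 2 are not.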

However, there are two ways to determine if another proc is on the same node as you. First, you can of course use the above method to determine if you share the same daemon:

if (orte_ess.proc_get_daemon(A) == ORTE_PROC_MY_DAEMON->vpid), then we are on the same node

Alternatively, you can use the proc locality since it contains a "node" bit:

if (OPAL_PROC_ON_LOCAL_NODE(orte_ess.proc_get_locality(A))), then the proc is on the same node as us.

Similarly, we can determine if another process shares a socket, NUMA node, or other hardware element with us by applying the corresponding OPAL_PROC_ON_xxx macro to the locality returned by calling orte_ess.proc_get_locality for that process.
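As an illustration of the macro-on-bitmask pattern, the sketch below counts how many peers share a socket with the caller. The DEMO_ macros and bit values are made up to mirror the shape of the OPAL_PROC_ON_xxx macros; in real code the locality values would come from orte_ess.proc_get_locality and be tested with the actual OPAL macros.

```c
#include <stdint.h>

/* Hypothetical bit values; the real ones live in the OPAL headers. */
#define DEMO_ON_NODE   0x1u
#define DEMO_ON_NUMA   0x2u
#define DEMO_ON_SOCKET 0x4u

/* Same shape as the OPAL_PROC_ON_xxx macros: test one bit of the mask. */
#define DEMO_PROC_ON_LOCAL_NODE(n)   (((n) & DEMO_ON_NODE) != 0)
#define DEMO_PROC_ON_LOCAL_NUMA(n)   (((n) & DEMO_ON_NUMA) != 0)
#define DEMO_PROC_ON_LOCAL_SOCKET(n) (((n) & DEMO_ON_SOCKET) != 0)

/* Count the peers whose locality (relative to the caller) has the
 * socket bit set. locality[i] is the bitmask for peer i. */
int demo_count_socket_peers(const uint16_t *locality, int npeers)
{
    int count = 0;
    for (int i = 0; i < npeers; i++) {
        if (DEMO_PROC_ON_LOCAL_SOCKET(locality[i])) {
            count++;
        }
    }
    return count;
}
```

Swapping in a different DEMO_PROC_ON_LOCAL_xxx macro gives the same count for another hardware level, since each level is just one bit in the mask.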

HTH
Ralph