Wonderful!!! We've been waiting for such functionality for a while.
I do have some questions/remarks related to this patch.
What is the my_node_rank in the orte_proc_info_t structure? Is there any difference between using the my_node_rank field and the vpid part of my_daemon? What is the correct way to find out whether two processes are on the same remote location: comparing their daemon vpid or their node_rank? How does the node_rank change with respect to dynamic process management, when new daemons join?
If I understand your last email correctly, the OPAL_PROC_ON_L*CACHE flags are only set for local processes?
I guess proc_flags in proc.h should be of type opal_paffinity_locality_t to match the flags at the ORTE level?
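To make the flag questions concrete, here is a minimal sketch of how a caller might test such a locality bitmask. The type name locality_t, the LOC_* constants and their values, and the helper functions are all hypothetical and chosen for illustration; they are not the actual opal_paffinity_locality_t definitions from r25323.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical locality bitmask. Names and bit values are assumptions
 * for illustration, not the real ORTE/OPAL definitions. */
typedef uint16_t locality_t;

#define LOC_UNKNOWN     0x0000
#define LOC_ON_NODE     0x0001
#define LOC_ON_NUMA     0x0002
#define LOC_ON_SOCKET   0x0004
#define LOC_ON_L3CACHE  0x0008
#define LOC_ON_L2CACHE  0x0010
#define LOC_ON_L1CACHE  0x0020
#define LOC_ON_CORE     0x0040
#define LOC_ON_HWTHREAD 0x0080

/* True if every bit in 'level' is set in 'loc'. */
static bool shares(locality_t loc, locality_t level)
{
    return (loc & level) == level;
}

/* True if the two procs share a cache at any level. */
static bool shares_any_cache(locality_t loc)
{
    return (loc & (LOC_ON_L1CACHE | LOC_ON_L2CACHE | LOC_ON_L3CACHE)) != 0;
}

/* Name of the most specific resource level present in the mask. */
static const char *closest_shared_level(locality_t loc)
{
    if (loc & LOC_ON_HWTHREAD) return "hwthread";
    if (loc & LOC_ON_CORE)     return "core";
    if (loc & LOC_ON_L1CACHE)  return "l1cache";
    if (loc & LOC_ON_L2CACHE)  return "l2cache";
    if (loc & LOC_ON_L3CACHE)  return "l3cache";
    if (loc & LOC_ON_SOCKET)   return "socket";
    if (loc & LOC_ON_NUMA)     return "numa";
    if (loc & LOC_ON_NODE)     return "node";
    return "none";
}
```

With a mask like this, a remote peer would presumably carry no bits at all (LOC_UNKNOWN here), which is why it matters whether proc_flags uses the same type and the same bit layout at both levels.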
A more high-level remark: the fact that the locality information is automatically packed and exchanged during the grpcomm modex call seems a little weird (do the upper levels have a say in it?). I would not have thought that grpcomm (which, based on the grpcomm.h header file, is a framework providing communication services that span entire jobs or collections of processes) is the place to put it.
Thanks,
george.
On Oct 19, 2011, at 16:28 , Ralph Castain wrote:
> Hi folks
>
> For those of you who don't follow the commits...
>
> I just committed (r25323) an extension of the orte_ess.proc_get_locality function that allows a process to get its relative resource usage with any other proc in the job. In other words, you can provide a process name to the function, and the returned bitmask tells you if you share a node, numa, socket, caches (by level), core, and hyperthread with that process.
>
> If you are on the same node and unbound, of course, you share all of those. However, if you are bound, then this can help tell you if you are on a common numa node, sharing an L1 cache, etc. Might be handy.
>
> I implemented the underlying functionality so that we can further extend it to tell you the relative resource location of two procs on a remote node. If that someday becomes of interest, it would be relatively easy to do - but would require passing more info around. Hence, I've allowed for it, but not implemented it until there is some identified need.
>
> Locality info is available anytime after the modex is completed during MPI_Init, and is supported regardless of launch environment (minus cnos, for now), launch by mpirun, or direct-launch - in other words, pretty much always.
>
> Hope it proves of help in your work
> Ralph