I discovered "dplace" today. I don't know how many people install/use it
on their cluster, but it's something that looks interesting when you
don't have advanced binding capabilities in the MPI implementation. For
instance, you could do:
$ mpirun -np 8 dplace -c 0,4,2,6,1,5,3,7 myprogram
to bind process ranks according to the machine topology.
hwloc-calc can easily generate such a list of physical processors, for instance:
$ hwloc-calc --physical proc:all --pulist
or even restrict it to one PU per socket with:
$ hwloc-calc --physical socket:all.core:0 --pulist
So hwloc-calc could help dplace significantly. Maybe we should put such
examples somewhere in the doc.
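
As a starting point for such a doc example, here is a rough sketch that feeds the hwloc-calc output into the dplace command line. build_bind_cmd is a hypothetical helper name, not part of hwloc or dplace, and passing the CPU list through "-c" is my reading of the dplace man page, so check it against your local installation. The helper only prints the mpirun line so that the PU list can be checked before anything is launched.

```shell
# Sketch only: build_bind_cmd is a hypothetical helper, not shipped with hwloc or dplace.
# It prints the mpirun command line instead of running it.
build_bind_cmd() {
    nranks="$1"   # number of MPI ranks to launch
    pulist="$2"   # comma-separated physical PU numbers, as printed by hwloc-calc
    shift 2       # the remaining arguments are the program and its options
    # Assumption: dplace takes the CPU list through its -c option.
    printf 'mpirun -np %s dplace -c %s %s\n' "$nranks" "$pulist" "$*"
}

# Use the hwloc-calc invocation from this mail when the tool is installed.
if command -v hwloc-calc >/dev/null 2>&1; then
    pulist=$(hwloc-calc --physical proc:all --pulist)
    build_bind_cmd 8 "$pulist" myprogram
fi
```

Once the printed line looks right, it can be pasted into a shell or job script as is.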