No. Each hwloc-bind command in the mpirun above doesn't know that there are other hwloc-bind instances on the same machine. All of them bind their process to all cores in the first socket.
=> Agreed. For socket:0.core:0-3, hwloc will bind the MPI processes to all cores in the first socket. But how are the individual processes mapped onto these cores? Will it be in this order:
rank 0 -> core 0
rank 1 -> core 1
rank 2 -> core 2
rank 3 -> core 3
Or in this *arbitrary* order:
rank 0 -> core 1
rank 1 -> core 3
rank 2 -> core 0
rank 3 -> core 2
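
To make the two cases concrete, here is a minimal Python sketch of what the first, ordered mapping would require: a distinct hwloc-bind location per rank rather than the shared socket:0.core:0-3. With the shared location, every rank is allowed to run on all four cores, so the binding by itself does not enforce either order. The helper name location_for_rank and the figure of 4 cores per socket are assumptions made for this illustration only.

```python
def location_for_rank(local_rank, socket=0, cores_per_socket=4):
    """Build a per-rank hwloc location string (hypothetical helper).

    Assumes ranks on a node are numbered from 0 and that the socket
    has `cores_per_socket` cores; ranks beyond that wrap around.
    """
    core = local_rank % cores_per_socket
    return f"socket:{socket}.core:{core}"


# One location per rank: rank 0 -> core 0, rank 1 -> core 1, ...
locations = [location_for_rank(rank) for rank in range(4)]
for rank, location in enumerate(locations):
    print(f"rank {rank} -> {location}")
```

A wrapper script could pass such a string to hwloc-bind, taking the rank from whatever local-rank variable the MPI launcher exports (for example OMPI_COMM_WORLD_LOCAL_RANK under Open MPI); whether that fits the setup discussed here is an assumption.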