No. Each hwloc-bind command in the mpirun above doesn't know that there are other hwloc-bind instances on the same machine. All of them bind their process to all cores in the first socket.

=> Agreed. For socket:0.core:0-3, hwloc will bind the MPI processes to all cores in the first socket. But how are the individual processes mapped onto these cores? Will it be in this order:

rank 0 -> core 0

rank 1 -> core 1

rank 2 -> core 2

rank 3 -> core 3

Or in this *arbitrary* order:

rank 0 -> core 1

rank 1 -> core 3

rank 2 -> core 0

rank 3 -> core 2

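The operating system decides where each process runs (within the limits of the binding). It usually has no knowledge of MPI ranks, and I don't think it looks at the PID numbers during scheduling. So the rank-to-core placement is very likely random, and since every process is allowed on all four cores, the scheduler may also move a process from one core to another over time.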
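If a fixed rank-to-core order is wanted, one option is to give each process its own single-core binding instead of the shared socket:0.core:0-3 one. Below is a minimal sketch of a wrapper script, under a few assumptions: the launcher is Open MPI (which exports OMPI_COMM_WORLD_LOCAL_RANK to each process), the first socket has at least as many cores as there are ranks per node, and the names bind_rank.sh and core_location are made up for this illustration.

```shell
#!/bin/sh
# bind_rank.sh (hypothetical name): bind each MPI process to one core of
# socket 0, chosen from its node-local rank.

# Build the hwloc location string for a given local rank,
# e.g. local rank 2 gives socket:0.core:2.
core_location() {
    printf 'socket:0.core:%s' "$1"
}

# Assumption: Open MPI sets OMPI_COMM_WORLD_LOCAL_RANK for every process.
# Other launchers use different variable names. Fall back to 0 if unset.
local_rank="${OMPI_COMM_WORLD_LOCAL_RANK:-0}"

# Only launch something when a command was given on the command line.
if [ "$#" -gt 0 ]; then
    exec hwloc-bind "$(core_location "$local_rank")" -- "$@"
fi
```

It would be started as something like "mpirun -np 4 ./bind_rank.sh ./my_app", where ./my_app stands for the real application. With this, rank 0 ends up on core 0, rank 1 on core 1, and so on, because each hwloc-bind instance gets a different single-core location rather than the same four-core set.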

