You might want to try the OMPI tarball that is about to become OMPI v1.6.1 -- we made a bunch of affinity-related fixes, and it should be much more predictable / stable in what it does in terms of process binding:
(these affinity fixes are not yet in a nightly 1.6 tarball because we're testing them before they get committed to the OMPI v1.6 SVN branch)
On May 30, 2012, at 9:54 AM, Brice Goglin wrote:
> Hello Youri,
> When using openmpi 1.4.4 with --np 2 --bind-to-core --bycore it reports the following:
>> [hostname:03339] [[17125,0],0] odls:default:fork binding child [[17125,1],0] to cpus 0001
>> [hostname:03339] [[17125,0],0] odls:default:fork binding child [[17125,1],1] to cpus 0002
> Bitmasks 0001 and 0002 mean CPUs with physical indexes 0 and 1 in OMPI 1.4. So that corresponds to the first core of each socket, and that matches what hwloc-ps says. Running "hwloc-ps -c" should show the same bitmasks.
> However, I agree that these are not adjacent cores, but I don't know enough about OMPI's binding options to understand what it was supposed to do in your case.
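
For reference, here is a minimal sketch of how the masks in the quoted log lines can be decoded. It assumes the "cpus" value is a hexadecimal bitmask in which bit i being set means physical CPU index i, which is consistent with Brice's reading of 0001 and 0002 above. The helper name cpus_in_mask is hypothetical and is not part of OMPI or hwloc.

```python
def cpus_in_mask(mask_hex):
    """Return the CPU indexes whose bits are set in a hex bitmask string.

    Assumption: bit i of the mask corresponds to physical CPU index i.
    """
    mask = int(mask_hex, 16)
    # Walk every bit position up to the highest set bit and keep the set ones.
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# The two masks reported by odls:default:fork in the quoted output.
for mask in ("0001", "0002"):
    print(mask, "->", cpus_in_mask(mask))
```

Under that assumption, 0001 decodes to CPU 0 and 0002 decodes to CPU 1, i.e. physical indexes 0 and 1 as described above.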