Thank you very much for your input, which makes my direction pretty clear now. Depending on the progress of my project, I may be adventurous enough to try the nightly tarball, or I may wait until a stable version is released.
I appreciate the hard work of the OMPI team, and I look forward to more flexible binding options in a future OMPI release.
--- On Mon, 2/14/11, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> From: Jeff Squyres <jsquyres_at_[hidden]>
> Subject: Re: [hwloc-users] hwloc-ps output - how to verify process binding on the core level?
> To: "Hardware locality user list" <hwloc-users_at_[hidden]>
> Date: Monday, February 14, 2011, 8:53 AM
> On Feb 14, 2011, at 9:35 AM, Siew Yin
> Chan wrote:
> > 1. I tried Open MPI 1.5.1 before turning to
> > hwloc-bind. Yep, Open MPI 1.5.1 does provide the --bycore
> > and --bind-to-core options, but they seem to bind
> > processes to cores on my machine according to the *physical*
> > indexes.
> FWIW, you might want to try one of the OMPI 1.5.2 nightly
> tarballs -- we switched the process affinity stuff to hwloc
> in 1.5.2 (the 1.5.1 stuff uses a different mechanism).
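As a side note on the thread subject, here is a minimal sketch of how the resulting binding could be checked at the core level with hwloc's command-line tools, assuming they are installed on the compute nodes and using the 1.5.x flags mentioned above:

$ mpirun --bycore --bind-to-core -np 8 ./test1
$ ssh compute-0-8 hwloc-ps -l    # bound processes, logical indexes
$ ssh compute-0-8 hwloc-ps -c    # same processes, raw cpusets

hwloc-ps by default only lists processes that are bound to a subset of the machine, so an empty listing would itself suggest that the binding did not take effect.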
> > FYI, my testing environment and application impose
> > these requirements for optimum performance:
> > i. Different binaries optimized for heterogeneous
> > machines. This necessitates MIMD, which can be done in
> > OMPI using the -app option (providing an application context
> > file).
> > ii. The application is communication-sensitive. Thus,
> > fine-grained process mapping on *machines* and on *cores* is
> > required to minimize the inter-machine and inter-socket
> > communication costs incurred on the network and on the
> > system bus. Specifically, processes should be mapped onto
> > successive cores of one socket before the next socket is
> > considered, i.e., socket.0:core0-3, then socket.1:core0-3.
> > In this case, the communication among neighboring ranks 0-3
> > will be confined to socket 0 without going through the
> > system bus. The same holds for ranks 4-7 on socket 1. As such,
> > the order of the cores should follow the *logical* indexes.
> I think that OMPI 1.5.2 should do this for you -- rather
> than following any logical/physical ordering, it does what
> you describe: traverses successive cores on a socket before
> going to the next socket (which happens to correspond to
> hwloc's logical ordering, but that was not the intent).
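In the meantime, one way to approximate that socket-by-socket ordering with an external binder is a small hwloc-bind wrapper. This is only a sketch: the script name bind.sh is made up, it assumes 4 cores per socket, and it assumes Open MPI exports OMPI_COMM_WORLD_LOCAL_RANK to each launched process:

$ cat bind.sh
#!/bin/sh
# Bind this rank to successive logical cores, filling socket 0 before socket 1.
rank=${OMPI_COMM_WORLD_LOCAL_RANK:?not launched by mpirun}
socket=$((rank / 4))   # assumes 4 cores per socket
core=$((rank % 4))
exec hwloc-bind socket:${socket}.core:${core} -- "$@"
$ mpirun -np 8 --host compute-0-8 ./bind.sh ./test1

hwloc-bind uses logical indexes by default, so ranks 0-3 would land on socket 0, cores 0-3, and ranks 4-7 on socket 1, cores 0-3.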
> FWIW, we have a huge revamp of OMPI's affinity support on
> the mpirun command line that will offer much more flexible
> binding choices.
> > Initially, I tried combining the features of rankfile
> > and appfile, e.g.,
> > $ cat rankfile8np4
> > rank 0=compute-0-8 slot=0:0
> > rank 1=compute-0-8 slot=0:1
> > rank 2=compute-0-8 slot=0:2
> > rank 3=compute-0-8 slot=0:3
> > $ cat rankfile9np4
> > rank 0=compute-0-9 slot=0:0
> > rank 1=compute-0-9 slot=0:1
> > rank 2=compute-0-9 slot=0:2
> > rank 3=compute-0-9 slot=0:3
> > $ cat my_appfile_rankfile
> > --host compute-0-8 -rf rankfile8np4 -np 4 ./test1
> > --host compute-0-9 -rf rankfile9np4 -np 4 ./test2
> > $ mpirun -app my_appfile_rankfile
> > but found out that only the rankfile stated on the
> > first line took effect; the second was ignored completely.
> > After some googling and trial and error, I decided
> > to try an external binder, and this direction led me to
> > hwloc-bind.
> > Maybe I should bring the issue of rankfile + appfile
> > to the OMPI mailing list.
> I'd have to look at it more closely, but it's possible that
> we only allow one rankfile per job -- i.e., that the
> rankfile should specify all the procs in the job, not on a
> per-host basis. But perhaps we don't warn/error if
> multiple rankfiles are used; I would consider that a bug.
> Jeff Squyres
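If it turns out that only one rankfile per job is allowed, a possible workaround sketch is a single rankfile covering all procs in the job, given once on the mpirun command line alongside the existing appfile. This is untested, the file names rankfile_all and my_appfile are made up, and whether -rf is honored together with -app is exactly the open question above:

$ cat rankfile_all
rank 0=compute-0-8 slot=0:0
rank 1=compute-0-8 slot=0:1
rank 2=compute-0-8 slot=0:2
rank 3=compute-0-8 slot=0:3
rank 4=compute-0-9 slot=0:0
rank 5=compute-0-9 slot=0:1
rank 6=compute-0-9 slot=0:2
rank 7=compute-0-9 slot=0:3
$ cat my_appfile
--host compute-0-8 -np 4 ./test1
--host compute-0-9 -np 4 ./test2
$ mpirun -rf rankfile_all -app my_appfile

Here the ranks are numbered globally across the whole job (0-3 from the first appfile line, 4-7 from the second), which matches the reading that the rankfile should specify all the procs in the job rather than being given per host.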