Open MPI Development Mailing List Archives

Subject: [OMPI devel] Fwd: [Open MPI] #3108: Affinity still busted in v1.6
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-05-24 15:02:09


FYI.

I think I have fixes ready, but I am bummed that we didn't fix the whole paffinity mess properly in 1.6. :-(

Begin forwarded message:

> From: Open MPI <bugs_at_[hidden]>
> Subject: [Open MPI] #3108: Affinity still busted in v1.6
> Date: May 24, 2012 2:59:42 PM EDT
> Cc: <bugs_at_[hidden]>
>
> #3108: Affinity still busted in v1.6
> ---------------------+----------------------------
> Reporter: jsquyres   | Owner: rhc
> Type: defect         | Status: new
> Priority: major      | Milestone: Open MPI 1.6.1
> Version: trunk       | Keywords:
> ---------------------+----------------------------
> I found a system yesterday where affinity is still horribly broken in
> v1.6: bind-to-core and bind-to-socket did completely incorrect things.
> Among other things, the system in question has essentially random
> physical socket/core numbering; it is not uniform across the cores in
> any given socket.
>
> I have a new Bitbucket repository where I think I've fixed the problems,
> and will be reviewing the code with Ralph soon:
>
> https://bitbucket.org/jsquyres/ompi-affinity-again-v1.6
>
> There were actually three bugs (that I've found so far); there's a
> separate commit in that Bitbucket repository for each. See the commit
> message on each of them.
>
> Once this firms up a bit, I'll make a tarball and ask others in the
> community to test it (e.g., Oracle and IBM, which have traditionally been
> good at finding wacky paffinity bugs).
>
> Note that this ''only'' affects OMPI v1.6 -- the trunk has a wholly
> revamped affinity system and the entire paffinity framework is gone
> (yay!).
>
> --
> Ticket URL: <https://svn.open-mpi.org/trac/ompi/ticket/3108>
> Open MPI <http://www.open-mpi.org/>
>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/