
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Bug in openmpi 1.5.4 in paffinity
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-09-06 10:17:45


Brice --

Should I apply that patch to the OMPI 1.5 series, or should we do a hwloc 1.2.2 release? I.e., is this broken on all AMD Magny-Cours machines?

Should I also do an emergency OMPI 1.5.x release with (essentially) just this fix? (OMPI 1.5.x currently contains hwloc 1.2.0)

On Sep 6, 2011, at 1:43 AM, Brice Goglin wrote:

> On 05/09/2011 21:29, Brice Goglin wrote:
>> Dear Ake,
>> Could you try the attached patch? It's not optimized, but it's probably
>> going in the right direction.
>> (and don't forget to remove the commented-out code mentioned above if you try it).
>
> Actually, now that I've seen your entire topology, I found out that the
> real fix is the attached patch. This is actually a Magny-Cours specific
> problem (having 2 NUMA nodes inside each socket is quite unusual). I've
> already committed this patch to hwloc trunk and backported to the v1.2
> branch. It could be applied in OMPI 1.5.5.
>
> The patch that I sent earlier is not needed as long as cgroups don't
> reduce the available memory (your cgroups don't). I'll fix this other
> bug properly soon.
>
> Brice
>
> <fix-cgroup-vs-magnycours.patch>
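
For context on the quoted fix: the unusual Magny-Cours layout (two NUMA
nodes inside each socket) can be checked with a few lines of hwloc. The
sketch below is illustrative only, assuming the hwloc 1.x C API
(HWLOC_OBJ_SOCKET / HWLOC_OBJ_NODE); it is not part of the patch itself.

  #include <hwloc.h>
  #include <stdio.h>

  int main(void)
  {
      hwloc_topology_t topo;
      int i, nsockets;

      hwloc_topology_init(&topo);
      hwloc_topology_load(topo);

      /* hwloc 1.x names: sockets are HWLOC_OBJ_SOCKET,
         NUMA nodes are HWLOC_OBJ_NODE */
      nsockets = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_SOCKET);
      for (i = 0; i < nsockets; i++) {
          hwloc_obj_t socket = hwloc_get_obj_by_type(topo, HWLOC_OBJ_SOCKET, i);
          /* count the NUMA nodes whose cpusets fall inside this socket;
             a Magny-Cours box should report 2 per socket */
          int nnodes = hwloc_get_nbobjs_inside_cpuset_by_type(topo,
                           socket->cpuset, HWLOC_OBJ_NODE);
          printf("socket %d: %d NUMA node(s)\n", i, nnodes);
      }

      hwloc_topology_destroy(topo);
      return 0;
  }

Compile with something like "gcc check_topo.c -o check_topo -lhwloc" and
compare the output against lstopo on the affected machine.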

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/