On 3/2/14 0:44 AM, Tru Huynh wrote:
> On Fri, Feb 28, 2014 at 08:49:45AM +0100, Bernd Dammann wrote:
>> Maybe I should say that we moved from SL 6.1 and OMPI 1.4.x to SL
>> 6.4 with the above kernel, and OMPI 1.6.5 - which means a major
>> upgrade of our cluster.
>> After the upgrade, users reported those slowdowns, and a search on
>> this list showed that other sites had the same (or similar) issues
>> with this kernel and OMPI version combination.
> afaik, the 2.6.32-431 series is from RHEL (and clones) version >= 6.5
You're right - the kernel is coming from the rolling release of SL.
> otoh, it might be related to http://bugs.centos.org/view.php?id=6949
Thanks!!! That was exactly the problem. We patched the kernel and
installed it on a few nodes, and so far testing looks promising. We had
the kernel scheduler on our radar, since we could see that there were
differences compared to the old kernel we'd used before, but didn't have
time to dig deeper into it yet. Great work! Let's hope this patch
will make it into the official kernel.