Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] IMB-MPI broadcast test stalls for large core counts: debug ideas?
From: Rahul Nabar (rpnabar_at_[hidden])
Date: 2010-08-24 13:38:35

On Mon, Aug 23, 2010 at 8:39 PM, Randolph Pullen
<randolph_pullen_at_[hidden]> wrote:
> I have had a similar load related problem with Bcast.

Thanks Randolph! That's interesting to know! What was the hardware you
were using? Does your bcast fail at the exact same point too?

> I don't know what caused it though.  With this one, what about the possibility of a buffer overrun or network saturation?

How can I test for a buffer overrun?

For network saturation I guess I could use something like mrtg to
monitor the bandwidth used. On the other hand, all 32 servers are
connected to a single dedicated Nexus5000. The back-plane carries no
other traffic. Hence I am skeptical that just 41943040 could have saturated
what Cisco rates as a 10GigE fabric. But I might be wrong.
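
As a rough back-of-envelope check on the saturation question, here is a minimal sketch of how long that amount of data would occupy a single link. It assumes the 41943040 figure is a byte count (the unit is not stated above) and uses the nominal 10 Gbit/s line rate, ignoring protocol overhead and how the broadcast fans out across the 32 servers. The function and constant names are made up for illustration.

```python
def transfer_seconds(payload_bytes: int, link_bits_per_second: float) -> float:
    """Ideal time to push payload_bytes through one link, ignoring overhead."""
    return payload_bytes * 8 / link_bits_per_second


# ASSUMPTION: 41943040 is a size in bytes; the message does not give the unit.
PAYLOAD_BYTES = 41943040
# Nominal 10GigE line rate in bits per second (not a measured value).
LINK_BITS_PER_SECOND = 10e9

seconds = transfer_seconds(PAYLOAD_BYTES, LINK_BITS_PER_SECOND)
print(f"{seconds * 1000:.1f} ms on one link at nominal line rate")
```

Under those assumptions a single copy of the payload keeps one link busy for a few tens of milliseconds, so a sustained stall would be hard to explain by raw bandwidth alone. This says nothing about switch buffering or incast behaviour, which would need to be measured.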