Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] very bad parallel scaling of vasp using openmpi
From: Joe Landman (landman_at_[hidden])
Date: 2009-09-23 08:23:34


Rahul Nabar wrote:
> On Tue, Aug 18, 2009 at 5:28 PM, Gerry Creager <gerry.creager_at_[hidden]> wrote:
>> Most of that bandwidth is in marketing... Sorry, but it's not a high
>> performance switch.
>
> Well, how does one figure out what exactly is a "high performance
> switch"? I've found this an exceedingly hard task. As the OP posted,
> the Dell 6248 is rated to give more than a fully subscribed backbone
> capacity. I do not know of any good third-party test lab, nor do I
> know of any switch load-testing benchmarks that would take a switch
> through its paces.
>
> So, how does one go about selecting a good switch? "The most expensive
> the better" is a somewhat unsatisfying option!

There are several options.

1) research the switches, get the numbers, and then find/interview
people who use them. See whether they perform as advertised.

2) hire a company to do the same for you, or more to the point, generate
a reasonable recommendation given your needs.

3) design a benchmark test, and try to run it against the switch. The
OSU tests from D. Panda could be used for switch testing as well as for
HBA testing, with some simple adjustments (Panda's focus is mostly on
latency and bandwidth as a function of message size; you could vary the
message size and measure bandwidth/throughput as a function of the
number of workers).

Option 3 is likely the easiest for you to do. Option 2 is likely what
you should do if you are designing a cluster and need an expert
(unbiased) opinion.
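
To make option 3 a bit more concrete, here is a rough sketch (not part of
the OSU suite) of how you might reduce such measurements once you have
them. It assumes you have already run a point-to-point bandwidth test
with 1, 2, 4, ... node pairs talking through the switch at the same time
and recorded the per-pair bandwidth for each run. The function name
scaling_report, the input layout, and the 0.8 threshold are all my own
assumptions for illustration.

```python
def scaling_report(per_pair_bw, threshold=0.8):
    """Summarize how aggregate bandwidth scales with concurrent pairs.

    per_pair_bw: dict mapping number of concurrent node pairs to the
                 list of per-pair bandwidths measured in that run
                 (any consistent unit, e.g. MB/s).
    threshold:   efficiency below which we call the switch saturated
                 (0.8 is an arbitrary illustrative choice).

    Returns (rows, knee), where rows is a list of
    (pairs, aggregate_bw, efficiency) tuples and knee is the smallest
    pair count whose efficiency fell below threshold, or None.
    """
    counts = sorted(per_pair_bw)
    # Baseline: mean per-pair bandwidth in the least-loaded run.
    first = per_pair_bw[counts[0]]
    baseline = sum(first) / len(first)

    rows = []
    for n in counts:
        aggregate = sum(per_pair_bw[n])
        # Ideal scaling would give n * baseline through the switch.
        efficiency = aggregate / (n * baseline)
        rows.append((n, aggregate, efficiency))

    knee = next((n for n, _, eff in rows if eff < threshold), None)
    return rows, knee


if __name__ == "__main__":
    # Hypothetical numbers, purely to show the call; not real data.
    measured = {1: [110.0], 2: [110.0, 108.0], 4: [60.0, 58.0, 61.0, 59.0]}
    rows, knee = scaling_report(measured)
    for n, agg, eff in rows:
        print(f"{n:3d} pairs  aggregate={agg:8.1f}  efficiency={eff:.2f}")
    print("first saturated pair count:", knee)
```

If efficiency stays near 1.0 as you add pairs, the backplane is keeping
up; a sharp drop at some pair count suggests where it stops doing so.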

Unfortunately, as Gerry indicates, there is a great deal of what I call
marketing numbers out there. There isn't enough real data. Marketing
numbers seem good on the surface. It's when you use the product that you
discover the reality isn't as rosy.

We have found several good gigabit switches for HPC/MPI codes. A number
of our customers started out with the least expensive switch possible
and ran into backplane problems at twenty-odd nodes, never mind the
hundred-plus they needed to run on.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman_at_[hidden]
web  : http://scalableinformatics.com
        http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615