Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] very bad parallel scaling of vasp using openmpi
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-08-17 21:23:55

You might want to run some performance testing of your TCP stacks and
the switch -- use a non-MPI application such as NetPIPE (or others --
google around) and see what kind of throughput you get. Try it
between individual server peers, and then try running it simultaneously
between a bunch of peers and see if the results are different, etc.

On Aug 17, 2009, at 5:51 PM, Craig Plaisance wrote:

> Hi - I have compiled vasp 4.6.34 using the Intel fortran compiler 11.1
> with openmpi 1.3.3 on a cluster of 104 nodes running Rocks 5.2, each
> with two quad-core Opterons, connected by Gbit ethernet. Running in
> parallel on one node (8 cores) works very well, faster than any other
> cluster I have run it on. However, running on 2 nodes in parallel only
> improves performance by 10% over the one-node case, while running on 4
> and 8 nodes yields no improvement over the two-node case. Furthermore,
> when running multiple (3-4) jobs simultaneously, performance decreases
> by around 50% compared to running only a single job on the entire
> cluster. The nodes are connected by a Dell Powerconnect 6248 managed
> switch. I get the same performance with mpich2, so I don't think the
> problem is specific to openmpi. Other vasp users have reported very
> good scaling up to 4 nodes on a similar cluster, so I don't think the
> problem is vasp either. Could something be wrong with the way mpi is
> configured to work with the switch? Could the operating system be
> misconfigured for the switch? Or does the switch itself need to be
> configured?
> Thanks!
> _______________________________________________
> users mailing list
> users_at_[hidden]

Jeff Squyres