I actually am only working on this code because it is what someone wants. I would have probably chosen a different solver as well. We do have some vector machines though that this will probably run on.
I did a lot of memory work already in the serial code to speed it up pretty significantly. I also have to a little careful of what other libraries are introduced. I am reading up on PETSc to see how flexible it is.
Thanks for the responses
From: Jed Brown [mailto:five9a2_at_[hidden]] On Behalf Of Jed Brown
Sent: Thursday, March 11, 2010 1:00 PM
To: Cole, Derek E; users_at_[hidden]
Subject: Re: [OMPI users] 3D domain decomposition with MPI
On Thu, 11 Mar 2010 12:44:01 -0500, "Cole, Derek E" <derek.e.cole_at_[hidden]> wrote:
> I am replying to this via the daily-digest message I got. Sorry it
> wasn't sooner... I didn't realize I was getting replies until I got
> the digest. Does anyone know how to change it so I get the emails as
> you all send them?
Log in at the bottom and edit options:
> I am doing a Red-black Gauss-Seidel algorithm.
Note that red-black Guss-Seidel is a terrible algorithm on cache-based hardware, it only makes sense on vector hardware. The reason for this is that the whole point is to approximate a dense action (the inverse of a matrix), but the red-black ordering causes this action to be purely local. A sequential ordering, on the other hand, is like a dense lower-triangular operation, which tends to be much closer to a real inverse. In parallel, you do these sequential sweeps on each process, and communicate when you're done. The memory performance will be twice as good, and the algorithm will converge in fewer iterations.
> The ghost points are what I was trying to figure for moving this into
> the 3rd dimension. Thanks for adding some concrete-ness to my idea of
> exactly how much overhead is involved. The test domains I am computing
> on are on the order of 100*50*50 or so. This is why I am trying to
> limit the overhead of the MPI communication. I am in the process of
> finding out exactly how big the domains may become, so that I can
> adjust the algorithm accordingly.
The difficulty is for small subdomains. If you have large subdomains, then parallel overhead will almost always be small.
> If I understand what you mean by pencils versus books, I don't think
> that will work for these types of calculations will it? Maybe a better
> explanation of what you mean by a pencil versus a book. If I was going
> to solve a sub-matrix of the XY plane for all Z-values, what is that
That would be a "book" or "slab".
I still recommend using PETSc rather than reproducing standard code to call MPI directly for this, it will handle all the decomposition and updates, and has the advantage that you'll be able to use much better algorithms than Gauss-Seidel.