Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] parallelising ADI
From: Jed Brown (jed_at_[hidden])
Date: 2012-03-06 17:50:33


On Tue, Mar 6, 2012 at 16:23, Tim Prince <n8tm_at_[hidden]> wrote:

> On 03/06/2012 03:59 PM, Kharche, Sanjay wrote:
>
>> Hi
>>
>> I am working on a 3D ADI solver for the heat equation. I have implemented
>> it in serial. Would anybody be able to indicate the best and most
>> straightforward way to parallelise it? Apologies if this is going to the
>> wrong forum.
>>
>
> If it's to be implemented in parallelizable fashion (not SSOR style
> where each line uses updates from the previous line), it should be feasible
> to divide the outer loop into an appropriate number of blocks, or decompose
> the physical domain and perform ADI on individual blocks, then update and
> repeat.

True ADI has inherently high communication cost: a lot of data movement is
needed to make the _fundamentally sequential_ tridiagonal solves local,
because latency will kill you if you try to keep those solves distributed.
This also applies (albeit to a lesser degree) in serial, due to the way
memory works.
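
To see why, consider the Thomas algorithm for a single tridiagonal line (a
minimal C sketch, not code from this thread): every forward-elimination step
depends on the row before it, and every back-substitution step on the row
after it, so splitting one line across ranks would force communication at
each step.

/* Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for i = 0..n-1.
 * d holds the right-hand side on entry and the solution on exit;
 * cp is scratch space of length n.  Assumes n >= 1. */
#include <stddef.h>

void thomas_solve(size_t n, const double *a, const double *b,
                  const double *c, double *d, double *cp)
{
    /* forward elimination: row i needs the modified coefficients of row i-1 */
    cp[0] = c[0] / b[0];
    d[0]  = d[0] / b[0];
    for (size_t i = 1; i < n; i++) {
        double m = 1.0 / (b[i] - a[i] * cp[i - 1]);
        cp[i] = c[i] * m;
        d[i]  = (d[i] - a[i] * d[i - 1]) * m;
    }
    /* back substitution: row i needs the solution of row i+1 */
    for (size_t i = n - 1; i > 0; i--)
        d[i - 1] -= cp[i - 1] * d[i];
}

Each ADI sweep does many such independent lines, but the sweeps alternate
direction, so for any fixed domain decomposition some sweeps have lines that
cross processor boundaries; that is where the data movement comes from.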

If you only do non-overlapping subdomain solves, you must use a Krylov
method just to ensure convergence. You can add overlap, but the Krylov
method is still necessary for any practical convergence rate. The method
will also require an iteration count proportional to the number of
subdomains across the global domain times the square root of the number of
elements across a subdomain. The constants may not be small, and this
asymptotic result is independent of the choice of subdomain solver. You need
a coarse level to improve this scaling.
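
As a concrete illustration (a sketch against the recent PETSc C API, not the
example referenced in this thread), this is what the one-level approach looks
like, assuming the implicit heat operator A and vectors b, x have already
been assembled; PCBJACOBI gives the non-overlapping case, PCASM the
overlapping one, and in both the outer Krylov method does the convergence
work:

#include <petscksp.h>

/* One-level domain decomposition wrapped in a Krylov method. */
PetscErrorCode solve_one_level_dd(Mat A, Vec b, Vec x)
{
  KSP ksp;
  PC  pc;

  PetscFunctionBeginUser;
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPGMRES));   /* the Krylov method is mandatory */
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCASM));        /* overlapping Schwarz; PCBJACOBI = no overlap */
  PetscCall(PCASMSetOverlap(pc, 1));      /* overlap improves the constant, not the scaling */
  PetscCall(KSPSetFromOptions(ksp));      /* -sub_pc_type ... selects the subdomain solver */
  PetscCall(KSPSolve(ksp, b, x));
  PetscCall(KSPDestroy(&ksp));
  PetscFunctionReturn(PETSC_SUCCESS);
}

Swapping the subdomain solver (ILU, direct LU, even a local ADI sweep) changes
the constant but not the growth in iterations with the number of subdomains.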

Sanjay, as Matt and I recommended when you asked the same question on the
PETSc list this morning, unless this is a homework assignment, you should
solve your problem with multigrid instead of ADI. We pointed you to simple
example code that scales well to many thousands of processes.
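
For comparison (again a sketch, not the specific example from the PETSc
list), switching the sketch above to multigrid is essentially a one-line
change of preconditioner. PCGAMG, PETSc's algebraic multigrid, builds its
coarse levels from the assembled matrix; on a structured DMDA grid,
geometric multigrid (-pc_type mg) is usually better still:

#include <petscksp.h>

/* Same solve, but with a multigrid preconditioner supplying the coarse
 * levels, so iteration counts stay roughly flat as the grid and the
 * number of processes grow. */
PetscErrorCode solve_with_multigrid(Mat A, Vec b, Vec x)
{
  KSP ksp;
  PC  pc;

  PetscFunctionBeginUser;
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPCG));      /* the implicit heat operator is SPD */
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCGAMG));       /* algebraic multigrid */
  PetscCall(KSPSetFromOptions(ksp));
  PetscCall(KSPSolve(ksp, b, x));
  PetscCall(KSPDestroy(&ksp));
  PetscFunctionReturn(PETSC_SUCCESS);
}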