
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] 3D domain decomposition with MPI
From: Gus Correa (gus_at_[hidden])
Date: 2010-03-10 18:50:37

Hi Derek

PS - The same book, "MPI, The Complete Reference", has a thorough
description of MPI types in Chapter 3.
You may want to create and use an MPI_TYPE_VECTOR with the
appropriate count, blocklength, and stride, to exchange all the
"0..Z" overlap slices in one fell swoop.
(If I understood right, this is your main concern.)
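
As a minimal sketch of what "count, blocklength, and stride" would be for one face, assume the array is stored as a single contiguous block a[nx][ny][nz] with z varying fastest (this is an assumption; an array built from separately allocated rows has no single stride). The names vector_params, y_face_params, and pack_vector are made up for illustration and are not part of MPI. The plain C below only computes the three numbers and walks the same elements a vector type with those numbers would describe:

```c
#include <assert.h>
#include <stddef.h>

/* The three layout arguments of MPI_Type_vector (hypothetical holder struct). */
typedef struct {
    int count;        /* number of blocks */
    int blocklength;  /* elements in each block */
    int stride;       /* elements between the starts of consecutive blocks */
} vector_params;

/* Layout of a face of constant y in a contiguous a[nx][ny][nz], z fastest:
   one run of nz elements per x index, consecutive runs ny*nz elements apart. */
vector_params y_face_params(int nx, int ny, int nz)
{
    assert(nx > 0 && ny > 0 && nz > 0);
    vector_params p;
    p.count = nx;
    p.blocklength = nz;
    p.stride = ny * nz;
    return p;
}

/* Copy the elements selected by the layout, starting at `first`, into a
   contiguous buffer `out`; returns the number of elements copied. */
size_t pack_vector(const double *first, vector_params p, double *out)
{
    size_t n = 0;
    for (int b = 0; b < p.count; ++b)
        for (int k = 0; k < p.blocklength; ++k)
            out[n++] = first[(size_t)b * (size_t)p.stride + (size_t)k];
    return n;
}
```

With MPI itself, the same three numbers would go to MPI_Type_vector(p.count, p.blocklength, p.stride, MPI_DOUBLE, &face), followed by MPI_Type_commit(&face), and the send buffer would point at the first element of the face (offset y*nz for the face at index y), so no manual packing is needed.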

However, in my experience, huge messages
(i.e. tens or hundreds of megabytes) can be a problem.
If the overlap region of your array is huge,
it may be better to send the slices one by one in a loop.

Also, I wonder why you want to decompose on both X and Y ("pencils"),
and not only X ("books"),
which may give you a smaller/simpler domain decomposition
and communication footprint.
Whether or not you can do it this way depends on your
computation, which I don't know about.
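
To make the "books" versus "pencils" difference concrete, here is a small sketch of how the neighbors come out on a px-by-py process grid, assuming ranks are laid out in row-major order (the ordering the MPI Cartesian topology functions use) and the domain is not periodic. grid_rank, grid_neighbor, and NO_NEIGHBOR are made-up names for illustration; in real MPI code MPI_Cart_create and MPI_Cart_shift would do this job and return MPI_PROC_NULL at the edges.

```c
#include <assert.h>

#define NO_NEIGHBOR (-1)  /* hypothetical stand-in for MPI_PROC_NULL */

/* Row-major rank of the process at grid coordinates (cx, cy). */
int grid_rank(int cx, int cy, int px, int py)
{
    assert(cx >= 0 && cx < px && cy >= 0 && cy < py);
    return cx * py + cy;
}

/* Rank of the process displaced by (dx, dy) from `rank`,
   or NO_NEIGHBOR if that position falls outside the grid. */
int grid_neighbor(int rank, int dx, int dy, int px, int py)
{
    assert(rank >= 0 && rank < px * py);
    int cx = rank / py + dx;
    int cy = rank % py + dy;
    if (cx < 0 || cx >= px || cy < 0 || cy >= py)
        return NO_NEIGHBOR;
    return grid_rank(cx, cy, px, py);
}
```

With py = 1 ("books") every process has at most two neighbors, one on each side in X. With both px and py greater than 1 ("pencils") an interior process has four face neighbors, so there are twice as many exchanges to set up per step.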

I hope this helps,
Gus Correa
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA

Gus Correa wrote:
> Hi Derek
> Typically in the domain decomposition codes we have here
> (atmosphere, oceans, climate)
> there is an overlap across the boundaries of subdomains.
> Unless your computation is so "embarrassingly parallel" that
> each process can operate from start to end totally independent from
> the others, you should expect such an overlap,
> but you didn't say what computation you want to do.
> The width of the overlap depends on the computation being done.
> For instance, in a two-point stencil finite difference PDE solver
> the overlap may have width one, but for broader FD stencils you
> will need broader overlaps.
> The redundant calculations of overlap points on neighbor subdomains
> in general cannot be avoided.
> Exchanging the overlap data across neighbor subdomain processes
> cannot be avoided either.
> However, **full overlap slices** are exchanged after each computational
> step (in our case here a time step).
> It is not a point-by-point exchange as you suggested.
> Overlap exchange does limit the usefulness/efficiency
> of using too many subdomains (e.g. if your overlap-to-useful-data
> ratio gets close to 100%).
> However, it is not as detrimental as you imagined based on your
> point-by-point exchange conjecture.
> If your domain is 100x100x100 and you split in subdomain slices
> across 5 processes, with a 1-point overlap (on each side)
> you will have a 2x5/100 = 10% waste due to overlap calculations
> (plus the MPI communication cost/time),
> but your problem is still being solved in (almost) 1/5 of the time
> it would take in serial mode.
> Since your array seems to fit nicely in Cartesian coordinates,
> you could use the MPI functions that create and explore
> the Cartesian domain topology.
> For details, see Chapter 6, section 6.5 of "MPI, The Complete Reference,
> Volume 1, The MPI Core, 2nd Ed.,
> by M. Snir, S. Otto, S. Huss-Lederman, D. Walker, and J. Dongarra,
> MIT Press, 1998."
> Also, this tutorial from Indiana University solves the 2D diffusion
> equation (first serial, then parallel with MPI) and may help.
> Unfortunately they don't use the MPI Cartesian functions, though:
> I believe there are other examples on the web,
> check the LLNL site:
> The book
> "Parallel Programming with MPI, by Peter Pacheco,
> Morgan Kaufmann, 1997" has worked out examples also.
> An abridged version is available here
> I hope this helps,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
> Cole, Derek E wrote:
>> Hi all. I am relatively new to MPI, and so this may be covered
>> somewhere else, but I can’t seem to find any links to tutorials
>> mentioning any specifics, so perhaps someone here can help.
>> In C, I have a 3D array that I have dynamically allocated and access
>> like Array[x][y][z]. I was hoping to calculate a subsection for each
>> processor to work on, of size nx in the x dimension, ny in the y
>> dimension, and the full Z dimension. Starting at Array[sx][sy][0] and
>> going to Array[ex][ey][z] where ey-sy=ny.
>> What is the best way to do this? I am able to calculate the
>> neighboring processors and assign a sub-section of the XY dimensions
>> to each processor, however I am having problems with sharing the
>> border information of the arrays with the other processors. I don’t
>> really want to have to do an MPI_Send for each of the 0..Z slices’
>> border information. I’d kind of like to process all of the Z, then
>> share the full “face” of the border information with the neighbor
>> processor. For example, if process 1 was the right neighbor of process
>> zero, I’d want process zero to send Subarray[0..nx][ny][0..Z] (the
>> rightmost face) to processor 1’s leftmost face, assuming the X-Y
>> plane was your screen, and the Z dimension extended into the screen.
>> If anyone has any information that talks about how to use the MPI data
>> types, or some other method, or wants to talk about how this might be
>> done, I’m all ears.
>> I know it is hard to talk about without pictures, so if you all like,
>> I can post a picture explaining what I want to do. Thanks!
>> Derek
>> ------------------------------------------------------------------------
> _______________________________________________
> users mailing list
> users_at_[hidden]
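
The overlap estimate in the quoted message (100 points split across 5 processes with a 1-point overlap on each side, giving 2x5/100 = 10%) can be written as a small function. This is only a sketch under stated assumptions: the domain is split along one dimension, the number of points divides evenly among the processes, and every subdomain is counted as having an overlap on both sides (the two end subdomains actually have one, so the figure is slightly pessimistic). overlap_fraction is a made-up name.

```c
#include <assert.h>

/* Extra (overlap) points per subdomain, as a fraction of its own points,
   for n points split evenly across nproc processes along one dimension
   with `width` overlap points on each side. */
double overlap_fraction(int n, int nproc, int width)
{
    assert(n > 0 && nproc > 0 && width >= 0);
    assert(n % nproc == 0);          /* assumption: even split */
    int thickness = n / nproc;       /* points owned by each process */
    return (2.0 * width) / thickness;
}
```

For n = 100, nproc = 5, width = 1 each process owns 20 points and carries 2 extra, which is the 10% above. With nproc = 50 each process owns only 2 points and the ratio reaches 100%, which is the regime where adding more subdomains stops paying off.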