I am using an MPI-parallelized simulation program with a domain decomposition in six dimensions.

In order to improve the scalability of my program, I would like to know by what criteria MPI distributes the ranks when MPI_Cart_create is called with reorder allowed.

To explain my inquiry, imagine a 3-dimensional solver in X-Y-M running on 4 computing nodes, where each node consists of 4 quad-core CPUs (4 nodes x 4 CPUs x 4 cores = 64 cores).

Now I decompose all 3 dimensions by 4 (4x4x4 = 64) using MPI_Cart_create.
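
For concreteness, the call I am talking about looks roughly like this (a minimal sketch; the dimension order and the periodicity flags are my assumptions):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int dims[3]    = {4, 4, 4};   /* X, Y, M */
        int periods[3] = {1, 1, 0};   /* X and Y periodic; my assumption */
        MPI_Comm cart;

        /* reorder = 1: MPI may renumber the ranks to fit the hardware */
        MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, 1, &cart);

        /* ... solver ... */

        MPI_Comm_free(&cart);
        MPI_Finalize();
        return 0;
    }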

MPI now has several possibilities to map the problem onto the hardware, e.g. X-M locally on a node and Y across the nodes, or Y-M locally and X across the nodes.

Now my question is: how can I tell MPI that I want to distribute X-Y locally (within a node) while M is distributed across the nodes? The reason is that the X-Y communication volume is much larger (FFTs) than that in M, where we have only 2 communications per time step via an Allreduce.
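
The only workaround I have come up with so far is to give up on reorder and rely on rank order instead: MPI_Cart_create numbers the ranks row-major, so with reorder = 0, the dimensions ordered as {M, X, Y}, and a launcher that places 16 consecutive ranks on each node, every node receives one complete X-Y plane. A sketch of that idea (the launcher placement assumption is its weak point):

    int dims[3]    = {4, 4, 4};            /* ordered {M, X, Y}: M varies slowest */
    int periods[3] = {0, 1, 1};
    MPI_Comm cart;
    MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, 0 /* no reorder */, &cart);

    /* sub-communicator along M only, for the two Allreduces per time step */
    int remain[3] = {1, 0, 0};             /* keep M, drop X and Y */
    MPI_Comm m_comm;
    MPI_Cart_sub(cart, remain, &m_comm);

    double moments[2] = {0.0, 0.0};        /* placeholder for the reduced data */
    MPI_Allreduce(MPI_IN_PLACE, moments, 2, MPI_DOUBLE, MPI_SUM, m_comm);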

An MPI implementation for the BlueGene, for example, has an option called mapfile with which one can tell MPI how to map the dimensions onto the nodes. I have not found anything similar for Open MPI.

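For reference, the kind of control I mean on the BlueGene looks roughly like this (quoting from memory, so treat the exact syntax as approximate):

    # mapfile: one line per rank, in MPI_COMM_WORLD order, giving the
    # torus coordinates <X> <Y> <Z> <T> on which that rank should be placed
    0 0 0 0
    0 0 0 1
    0 0 0 2

started with something like "mpirun -mapfile my.map -np 64 ./solver".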

Does anybody know how to achieve this mapping, or could anybody tell me where I can find some examples or tutorials?

Thank you very much for your help and best wishes

Paul Hilscher
