Dear Jeff, thanks for the information.

>Open MPI currently has very limited cartesian support -- it actually doesn't remap anything.

I see, Open MPI doesn't remap anything; this probably explains why the runtime of my simulation sometimes varies by around 30% for the same setup.

>Would you have any interest in writing a partitioning algorithm for your needs within the context of a plugin?  I'd be happy to walk you through the process; it's not too complicated (although we should probably move the discussion off to the Open MPI devel mailing list).

I guess after using open source software for more than a decade, it's time to give something back :). So yes, I am willing to do that!

Because I am not yet experienced with Open MPI internals, I would really appreciate your advice on where exactly I have to dig in. I guess it should be around the ompi_topo_create function, but for how to write MPI_Cart_create as a plugin, I will rely on your guidance. Also, do you know whether MPICH, LAM, etc. have an efficient implementation of MPI_Cart_create, so that I can borrow some ideas from them?
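
For reference, here is a minimal sketch in C of the placement I am after, written without any MPI calls so that it can be compiled on its own. It rests on two assumptions: that the launcher places consecutive ranks on the same node (16 per node in my example below), and that, since nothing is remapped, the Cartesian ranks simply follow the row-major ordering the MPI standard specifies for Cartesian topologies (last dimension varies fastest). Under those assumptions, listing M first in dims should keep each X-Y plane on a single node. The helper names are hypothetical and are not Open MPI functions.

```c
#include <stddef.h>

/* Row-major rank of a Cartesian coordinate: the last dimension varies
 * fastest, as in MPI Cartesian topologies when ranks are not reordered. */
int cart_rank_row_major(size_t ndims, const int *dims, const int *coords)
{
    int rank = 0;
    for (size_t d = 0; d < ndims; ++d)
        rank = rank * dims[d] + coords[d];
    return rank;
}

/* ASSUMPTION: by-slot placement, i.e. ranks 0..cores_per_node-1 run on
 * node 0, the next cores_per_node ranks on node 1, and so on. */
int node_of_rank(int rank, int cores_per_node)
{
    return rank / cores_per_node;
}

/* Hypothetical helper: node hosting grid point (x, y, m) when the
 * dimensions are handed over in the order {M, X, Y}, so that M is the
 * slowest-varying dimension. */
int node_of_xym(int x, int y, int m, int nx, int ny, int nm,
                int cores_per_node)
{
    const int dims[3]   = { nm, nx, ny };
    const int coords[3] = { m, x, y };
    return node_of_rank(cart_rank_row_major(3, dims, coords),
                        cores_per_node);
}
```

With nx = ny = nm = 4 and 16 cores per node, every (x, y) pair with the same m lands on the same node in this sketch, so only the Allreduce over M would cross node boundaries. Whether the by-slot assumption holds on a given cluster is something I would still have to check.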

best wishes,

Paul Hilscher

 

On Tue, Jun 29, 2010 at 8:17 PM, Jeff Squyres <jsquyres@cisco.com> wrote:
Open MPI currently has very limited cartesian support -- it actually doesn't remap anything.

That being said, it is *very* easy to extend Open MPI's algorithms for cartesian partitioning.  As you probably already know, Open MPI is all about its plugins -- finding and selecting a good set of plugins to use at run-time.  Open MPI has many different types of plugins.  One of these types of plugins performs the cartesian/graph mapping behind the MPI_Cart_create function (and friends).

Would you have any interest in writing a partitioning algorithm for your needs within the context of a plugin?  I'd be happy to walk you through the process; it's not too complicated (although we should probably move the discussion off to the Open MPI devel mailing list).


On Jun 29, 2010, at 4:50 AM, Paul Hilscher wrote:

> Dear OpenMPI list,
>
> I am using an MPI-parallelized simulation program with a domain decomposition in 6 dimensions.
> In order to improve the scalability of my program, I would like to know according to which criteria
> MPI distributes the ranks when using MPI_Cart_create (with reorder allowed).
>
> To explain my inquiry, imagine a 3-dimensional solver in X-Y-M and 4 computing
> nodes, where each node consists of 4 quad-core CPUs (4 nodes x [4 CPUs x 4 cores] = 64 cores).
>
> Now I decompose all 3 dimensions by 4 (4x4x4 = 64) using MPI_Cart_create.
> MPI now has several possibilities to map the problem, e.g. X-M (locally) on a node and
> Y across the nodes, or Y-M (locally) and X across the nodes.
>
> Now my question is: how can I tell MPI that I want to distribute X-Y locally while
> M is distributed across nodes? The reason is that the X-Y
> communication volume is much larger (FFT) compared to M, where we have only
> 2 communications per time-step via an Allreduce.
> An MPI implementation for the Blue Gene, for example, has an option
> called mapfile where one can tell MPI how to map the dimensions onto
> the nodes. I did not find anything similar for Open MPI.
>
> Does anybody know how to achieve this mapping, or could anybody
> tell me where I could find some examples or tutorials?
>
> Thank you very much for your help and best wishes
>
> Paul Hilscher