Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] collective algorithms
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-12-02 08:59:48

On Nov 25, 2008, at 10:29 AM, Максим Чусовлянов wrote:

> Hello! How i can integrated my collective communication algorithm in
> openMPI with MCA?

Sorry for the delay in answering -- SC08 and the US holiday last week
got in the way and I'm way behind on answering the mails in my INBOX.

Just to make sure we're talking about the same thing -- you have a new
collective algorithm for one of the MPI collective functions, and you
want to include that code in Open MPI so that it can be invoked by
MPI_<foo> in MPI applications, right?

If so, the right way to do this is to build a new Open MPI
"coll" (collective) component containing the code for your new
algorithm. Our coll components are basically a few housekeeping
functions and a bunch of function pointers for the functions to call
that are the back-ends to MPI collective functions (i.e., MPI_Bcast
and friends).
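To make the "bunch of function pointers" idea concrete, here is a minimal standalone sketch. It does not use the real Open MPI structs (those live in ompi/mca/coll/coll.h and look different); every name below is hypothetical, and the separate MPI processes are faked as rows of one array so the example runs without MPI.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical back-end signature: bufs holds nranks rows of count ints,
 * one row per simulated rank.  Not the real coll.h signature. */
typedef int (*sketch_bcast_fn)(int *bufs, int nranks, size_t count, int root);

/* Hypothetical table of back-ends, one pointer per collective. */
struct sketch_coll_fns {
    sketch_bcast_fn bcast;
};

/* A trivial "algorithm": copy the root's row into every other row. */
int sketch_linear_bcast(int *bufs, int nranks, size_t count, int root)
{
    for (int r = 0; r < nranks; r++) {
        if (r != root) {
            memcpy(bufs + (size_t)r * count,
                   bufs + (size_t)root * count,
                   count * sizeof(int));
        }
    }
    return 0;
}

/* Stand-in for the front-end: it knows nothing about the algorithm and
 * simply calls through whatever pointer the table holds. */
int sketch_front_end_bcast(const struct sketch_coll_fns *fns,
                           int *bufs, int nranks, size_t count, int root)
{
    assert(fns != NULL && fns->bcast != NULL);
    return fns->bcast(bufs, nranks, count, root);
}
```

Swapping in a different algorithm then means filling the table with a different function; the front-end call does not change.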

All the "coll" component code is under the ompi/mca/coll/ directory.
The "base" directory is some "glue" code for the coll framework itself
-- it's not a component. But all other directories are standalone
components that have corresponding dynamic shared objects (DSOs)
installed under $pkglibdir (typically $prefix/lib/openmpi).

You can build a component inside or outside of the Open MPI tree. If
you build outside of the Open MPI tree, you need to configure OMPI
with --with-devel-headers, which will install all of OMPI's internal
headers under $prefix. That way, you can -I these headers when you
compile your component. Just install your DSO in $pkglibdir; if all
goes well, "ompi_info | grep coll" should show your component.

If you build inside of the Open MPI tree, you need to make your
component dir under ompi/mca/coll/ and include a configure.params file
(look at ompi/mca/coll/basic/configure.params for a simple example)
and a (see ompi/mca/coll/basic/ for an
example). Then run the "" script that is at the top of the
tree and then run configure. You should see your component listed in
both the and configure output; configure should note that it
plans to build that component. When you finish configure, build and
install Open MPI. "ompi_info | grep coll" should show your component.

But I'm getting ahead of myself... Let's go back a few steps...

When building inside the OMPI tree, if you need to check for various
things to determine if you can build the component (i.e., some tests
during configure, such as checking for various hardware support
libraries), you can also add a configure.m4 file in your component's
directory. This gets a little tricky if you're not familiar with
Autoconf; let me know if you need some guidance here.

Now you can add the source code to the component. We have 2 important
abstractions that you need to know about:

- component: there is only one component instance in an MPI process.
It has global state.
- module: in the coll framework, there is one module instance for
every communicator that uses this component. It has local state
relevant to that specific communicator.

Think of "component" as a C++ class, and "module" as a C++ object.
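In plain C the same split might look roughly like the sketch below: one statically allocated component with process-wide state, and one heap-allocated module per communicator. The struct and function names are made up for illustration and are not the ones in coll.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical component: exactly one per process, holds global state. */
struct sketch_component {
    int priority;          /* arbitrary illustrative value */
    int modules_created;   /* global bookkeeping */
};

/* Hypothetical module: one per communicator, holds local state. */
struct sketch_module {
    struct sketch_component *component;  /* back-pointer to the singleton */
    int comm_size;                       /* state for this communicator only */
};

/* The single component instance. */
struct sketch_component sketch_component_instance = { 50, 0 };

/* Called once per communicator in this sketch. */
struct sketch_module *sketch_module_create(int comm_size)
{
    assert(comm_size > 0);
    struct sketch_module *m = malloc(sizeof *m);
    if (m == NULL) {
        return NULL;
    }
    m->component = &sketch_component_instance;
    m->comm_size = comm_size;
    sketch_component_instance.modules_created++;
    return m;
}

void sketch_module_destroy(struct sketch_module *m)
{
    free(m);
}
```

Two communicators give two modules with their own comm_size, but both point back at the same component.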

Now read the comments in ompi/mca/coll/coll.h. This file contains the
struct interfaces for both the coll component and module. We
basically do everything by function pointer; the component returns a
set of function pointers and each module returns a struct of function
pointers. These function pointers are invoked by libmpi at various
times for various functions; see coll.h for a description of each.

During coll module initialization (i.e., when a new communicator has
been created), there's a process called "selection" where OMPI
determines which coll modules will be used on this communicator.
Modules can include/exclude themselves from the selection process.
For example, your algorithm may only be suitable for
intracommunicators. So if the communicator being created is an
intercommunicator, you probably want to exclude your
module from selection. Or if your algorithm can only handle powers-of-
two MPI processes, it should exclude itself if there is a non-power-of-
two number of processes in the communicator. And so on.
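The two exclusion checks mentioned above could be sketched like this. The communicator struct, the function names, the priority value, and the "return -1 to decline" convention are all assumptions made for this standalone example; check coll.h for how a real component's query function reports that it does not want to be selected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down description of a communicator. */
struct sketch_comm {
    int  size;       /* number of processes */
    bool is_inter;   /* true for an intercommunicator */
};

/* A positive integer is a power of two iff exactly one bit is set. */
bool sketch_is_power_of_two(int n)
{
    return n > 0 && (n & (n - 1)) == 0;
}

/* Returns a priority (60 is an arbitrary illustrative value) if this
 * module is willing to run on comm, or -1 to exclude itself. */
int sketch_comm_query(const struct sketch_comm *comm)
{
    assert(comm != NULL);
    if (comm->is_inter) {
        return -1;   /* algorithm only handles intracommunicators */
    }
    if (!sketch_is_power_of_two(comm->size)) {
        return -1;   /* algorithm needs a power-of-two process count */
    }
    return 60;
}
```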

We designed coll modules in OMPI v1.3 to be "mix-n-match"-able such
that in a single communicator, you can use the broadcast function from
one module, but the gather function from a different module. Hence,
multiple coll modules may be active on a single communicator. In your
case, you'll need to make sure that your function has a higher
priority than the "tuned" coll component (which is the default in many

I'd suggest working in the Open MPI v1.3 tree, as we're going to
release this version soon and all future work is being done here (vs.
the v1.2 tree, which will eventually be deprecated).

Hopefully this is enough information to get you going. Please feel
free to ask more questions! But you might want to post followup
questions to the devel list; these aren't really user-level questions.

Good luck!

Jeff Squyres
Cisco Systems