Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi src rpm and message coalesce
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-04-15 07:14:24

On Apr 10, 2009, at 9:54 AM, vkm wrote:

> I was trying to understand how "btl_openib_use_message_coalescing"
> is working.

Heh. It's ugly. :-)

It's purely a benchmark optimization; there are very few (if any)
real-world apps that will benefit from this feature. I freely admit
that we were pressured by marketing types to put in this feature
(despite resisting it for a year or two). Basically, it applies when
you're sending the exact same message to the same MPI peer repeatedly
and you run out of networking buffers (e.g., you're waiting for the
current set of messages to drain before any more network buffers will
become available). If you notice that the last message on the queue is
exactly the same as your message, then you can just increment a
counter on that last message. When that last message is finally sent,
it effectively carries N messages (where N == the counter) in one
fragment. The receiver knows about this optimization and will match N
posted MPI receives against that one incoming message.
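
To make the mechanism above concrete, here is a minimal sketch in C.
It is a toy model only, not the openib BTL code: frag_t, send_queue_t,
queue_init, queue_send, receives_matched and the two size constants
are all hypothetical names made up for illustration. It assumes one
pending queue and a simple count of free network buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PAYLOAD 64  /* hypothetical fragment payload size */
#define QUEUE_CAP   8   /* hypothetical pending-queue capacity */

/* One pending fragment bound for a single peer (toy model). */
typedef struct {
    int    peer;
    size_t len;
    char   payload[MAX_PAYLOAD];
    int    count;  /* how many identical sends this fragment stands for */
} frag_t;

typedef struct {
    frag_t frags[QUEUE_CAP];
    size_t n_frags;       /* fragments currently queued */
    int    free_buffers;  /* network buffers still available */
} send_queue_t;

typedef enum { SENT_NEW, COALESCED, MUST_WAIT } send_result_t;

void queue_init(send_queue_t *q, int free_buffers)
{
    assert(free_buffers >= 0 && free_buffers <= QUEUE_CAP);
    q->n_frags = 0;
    q->free_buffers = free_buffers;
}

send_result_t queue_send(send_queue_t *q, int peer,
                         const void *buf, size_t len)
{
    assert(len <= MAX_PAYLOAD);

    /* Normal path: a buffer is free, so queue a new fragment. */
    if (q->free_buffers > 0) {
        frag_t *f = &q->frags[q->n_frags++];
        f->peer = peer;
        f->len = len;
        memcpy(f->payload, buf, len);
        f->count = 1;
        q->free_buffers--;
        return SENT_NEW;
    }

    /* Out of buffers: if the last queued fragment is byte-for-byte the
     * same message to the same peer, just bump its counter. */
    if (q->n_frags > 0) {
        frag_t *last = &q->frags[q->n_frags - 1];
        if (last->peer == peer && last->len == len &&
            memcmp(last->payload, buf, len) == 0) {
            last->count++;
            return COALESCED;
        }
    }

    /* Otherwise the sender has to wait for buffers to drain. */
    return MUST_WAIT;
}

/* Receiver side: one incoming fragment satisfies `count` posted receives. */
int receives_matched(const frag_t *f)
{
    return f->count;
}
```

With a single free buffer, three identical sends to one peer end up as
one queued fragment with count == 3, while a different payload or a
different peer gets MUST_WAIT. That only helps a loop that sends the
same bytes over and over, which is what benchmarks tend to do.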

This is a bunch of logic that was added that benefits benchmarks but
not real apps. Yuck. :-(

> For a certain test scenario, IMB-EXT works if I use
> "btl_openib_use_message_coalescing = 0" but not with
> "btl_openib_use_message_coalescing = 1".
> I have no idea whether the bug is in Open MPI or in the low-level
> driver.

Could this be related to

> I have one more concern as well. I added some prints to debug
> Open MPI, following the procedure below:
> Extract OFED TAR
> Extract openmpi*.src.rpm
> Go to SOURCE
> Extract openmpi*.tgz
> modify code
> Create TAR
> Create openmpi*.src.rpm
> Build rpm

It is probably a whole lot simpler / faster to just get a source
tarball from and build / install it manually (rather
than create a new RPM every time). Particularly if you're adding
printf's in Open MPI components -- you can just "make install"
directly from the component directory (which will compile and install
just that plugin -- not all of OMPI).

Note, too, that you might want to use "opal_output(0, "printf-like
string with %d, %s, ...etc.", ...printf-like varargs....)" for
debugging output instead of printf.
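
As a rough illustration of that call shape, here is a small standalone
sketch in C. It does not use Open MPI itself: standin_output and
trace_send are hypothetical names, and the stand-in's "[output N]"
prefix, trailing newline and return value are assumptions made only so
the example can be compiled and checked on its own. Inside the Open MPI
source tree you would call opal_output(0, ...) directly instead.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in with the same argument shape as
 * opal_output(id, "printf-like format", ...varargs). It writes to
 * stderr and returns the number of characters the format expanded to. */
int standin_output(int output_id, const char *format, ...)
{
    va_list ap;
    int n;

    fprintf(stderr, "[output %d] ", output_id);
    va_start(ap, format);
    n = vfprintf(stderr, format, ap);
    va_end(ap);
    fputc('\n', stderr);
    return n;
}

/* Hypothetical debug trace of the kind you might drop into a component
 * while chasing a problem. */
int trace_send(int peer, size_t len, const char *tag)
{
    return standin_output(0, "send to peer %d: %lu bytes (%s)",
                          peer, (unsigned long) len, tag);
}
```

Calling trace_send(1, 4, "x") writes one line to stderr with the peer,
length and tag filled in, the same way a printf would.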

Jeff Squyres
Cisco Systems