Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Tuned Collective MCA params
From: kyron_at_[hidden]
Date: 2008-10-03 15:10:41

> Eric,
> In 1.3 and some of the latest 1.2.X versions, tuned is the default
> component for collectives. However, the tuned collectives currently in
> the trunk are optimized for high-performance networks (such as IB or
> MX), and they do not deliver the best performance on slower devices
> such as Ethernet.

I forgot to mention that the version I am running is 1.2.7. Since I am
running over Ethernet I know I can't expect miracles, but I was at least
wondering whether I could expect some performance gain from using
Allgather instead of Send/Recv, even given that context.

> In order to play with the different implementations of allgather you
> should set the following MCA parameters, either in
> $HOME/.openmpi/mca-params.conf or on the command line:
> 1) coll_tuned_use_dynamic_rules to 1 in order to enable fine-grained
> selection of the algorithms

The description wasn't too clear about its usage, thanks.

> 2) coll_tuned_allgather_algorithm to a value between 0 and 6 (read the
> output corresponding to this algorithm from 'ompi_info --param coll
> tuned' once you have enabled the dynamic rules).

Since `ompi_info --param coll tuned | grep coll_tuned_allgather_algorithm`
returns nothing, I'll assume it's not part of 1.2.7. I'll dig into the code
to see what my options are, otherwise I'll be forced to install 1.3 ;)
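
One thing worth ruling out first: the quoted advice says to read the
ompi_info output once the dynamic rules are enabled, so the parameter
may be hidden rather than absent. A minimal sketch of that check
follows. It assumes that ompi_info honours OMPI_MCA_* environment
variables the same way mpirun does; whether 1.2.7 lists the parameter
at all is not confirmed here.

```shell
# Re-run the lookup with dynamic rules switched on through the environment.
# Assumption: ompi_info picks up OMPI_MCA_* variables, as mpirun does.
if command -v ompi_info >/dev/null 2>&1; then
    # "|| true" keeps an empty grep result from aborting under "set -e".
    found=$(OMPI_MCA_coll_tuned_use_dynamic_rules=1 \
            ompi_info --param coll tuned \
            | grep coll_tuned_allgather_algorithm || true)
    if [ -n "$found" ]; then
        status="listed"
        printf '%s\n' "$found"
    else
        status="not listed"
    fi
else
    status="ompi_info not on PATH"
fi
echo "coll_tuned_allgather_algorithm: $status"
```

If the result is still "not listed" with dynamic rules on, the
conclusion that this build lacks the parameter looks much safer.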

> This will allow you to select a specific algorithm for the allgather.
> You can tune it further by playing with the fanout (in the case of tree
> topologies) and with the segment size (for the pipelined ones).


> george.
> On Oct 3, 2008, at 8:48 AM, Eric Thibodeau wrote:
>> Hello all,
>> I am currently profiling a simple case where I replace multiple
>> S/R calls with Allgather calls and it would _seem_ the simple S/R
>> calls are faster. Now, *before* I come to any conclusion on this,
>> one of the pieces I am missing is more detail on how/if/when the
>> tuned coll MCA is selected. In other words, can I assume the tuned
>> versions are used by default? I skimmed through the well-documented
>> source code, but before I can even start to analyze the replacement's
>> impact (in a small cluster), I need to know how and when the tuned
>> coll MCA is used/selected.
>> Thanks,
>> Eric