> In 1.3 and some of the latest 1.2.x versions, tuned is the default
> component for collectives. However, the tuned component currently in
> the trunk is optimized for high-performance networks (such as IB or MX),
> and it does not deliver the best performance on slower devices such as
I forgot to mention the version I am running is 1.2.7. Since I am running
Ethernet I know I can't expect miracles but I was at least wondering if I
could expect some performance gain by using Allgather compared to
Send/Recv, even given that context.
> In order to play with the different implementations of allgather, you
> should set the following MCA parameters, either in
> $(HOME)/.openmpi/mca-params.conf or on the command line:
> 1) coll_tuned_use_dynamic_rules to 1 in order to enable fine-grained
> selection of the algorithms
The description wasn't too clear about its usage, thanks.
> 2) coll_tuned_allgather_algorithm to a value between 0 and 6 (read the
> output corresponding to this algorithm from 'ompi_info --param coll
> tuned' once you have enabled the dynamic rules).
Since `ompi_info --param coll tuned|grep coll_tuned_allgather_algorithm`
returns nothing, I'll assume it's not part of 1.2.7. I'll dig into the code
to see what my options are, otherwise I'll be forced to install 1.3 ;)
> This will allow you to select a specific algorithm for the allgather.
> You can further tune it by playing with the fanout (in the case of tree
> topologies) and with the segment size (for the pipelined ones).
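
For reference, here is a rough sketch of how the two parameters quoted above might be written in $HOME/.openmpi/mca-params.conf. It assumes a build that actually exposes coll_tuned_allgather_algorithm (1.2.7 does not seem to list it), and the value 3 is only a placeholder, not a recommendation; the meaning of each value should be read from `ompi_info --param coll tuned`.

```
# $HOME/.openmpi/mca-params.conf (sketch; values are illustrative)

# Enable fine-grained selection of the collective algorithms
coll_tuned_use_dynamic_rules = 1

# Force one allgather algorithm (0 to 6 per the quoted message);
# 3 is an arbitrary placeholder, check ompi_info for what each value means
coll_tuned_allgather_algorithm = 3
```

The same settings can presumably be passed per run instead, e.g. `mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_allgather_algorithm 3 -np 4 ./a.out`, where `./a.out` and the process count stand in for the real job.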
> On Oct 3, 2008, at 8:48 AM, Eric Thibodeau wrote:
>> Hello all,
>> I am currently profiling a simple case where I replace multiple
>> S/R calls with Allgather calls and it would _seem_ the simple S/R
>> calls are faster. Now, *before* I come to any conclusion on this,
>> one of the pieces I am missing is more details on how/if/when the
>> tuned coll MCA is selected. In other words, can I assume the tuned
>> versions are used by default? I skimmed through the well-documented
>> source code but before I can even start to analyze the replacement's
>> impact (in a small cluster), I need to know how and when the tuned
>> coll MCA is used/selected.