Collective performance is something that we are definitely working
on. Some of the collectives have had a fair amount of tuning done
(the "tuned" coll component), but
a) not all of them are done
b) they were originally tuned for TCP networks
Note that b) is somewhat misleading; the "tuned" coll component has
bunches of different algorithms for each collective operation that it
supports. The difficult part is deciding which algorithm to use in
each scenario. Message size, the number of processes involved, and
network latency and bandwidth all factor into that decision. The first
round of "decision functions" (the
run-time entity that decides which algorithm to use) assumed GB TCP
Work is ongoing at U. Tennessee to incorporate decision functions for
other networks (InfiniBand and Myrinet); we hope to include these in
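
To illustrate what a "decision function" weighs, here is a minimal sketch in Python. It is not the code in the "tuned" component, and the function names are hypothetical. It assumes a simple latency/bandwidth cost model for two well-known broadcast algorithms (a binomial tree, and a scatter followed by an allgather) and picks whichever the model estimates to be cheaper. The network figures in the usage note are rough assumptions, not measurements.

```python
import math


def bcast_costs(msg_bytes, nprocs, latency_s, bandwidth_bytes_per_s):
    """Estimate broadcast time (seconds) for two algorithms.

    Textbook-style estimates, assumed here for illustration only:
      binomial tree:       ceil(log2 p) * (latency + n / bandwidth)
      scatter + allgather: (ceil(log2 p) + p - 1) * latency
                           + 2 * (p - 1) / p * n / bandwidth
    """
    steps = math.ceil(math.log2(nprocs))
    transfer = msg_bytes / bandwidth_bytes_per_s
    return {
        "binomial_tree": steps * (latency_s + transfer),
        "scatter_allgather": (steps + nprocs - 1) * latency_s
        + 2.0 * (nprocs - 1) / nprocs * transfer,
    }


def choose_bcast_algorithm(msg_bytes, nprocs, latency_s, bandwidth_bytes_per_s):
    """Hypothetical decision function: return the name of the cheaper algorithm."""
    costs = bcast_costs(msg_bytes, nprocs, latency_s, bandwidth_bytes_per_s)
    return min(costs, key=costs.get)
```

Under this model the choice for the same message can differ between networks. For example, `choose_bcast_algorithm(128 * 1024, 256, 50e-6, 1.25e8)` (an assumed gigabit TCP link) and `choose_bcast_algorithm(128 * 1024, 256, 2e-6, 1e9)` (an assumed low-latency interconnect) select different algorithms, which is why rules tuned for one network do not carry over to another.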
On Oct 16, 2006, at 11:23 AM, Scott Weitzenkamp (sweitzen) wrote:
> I see this too, and had filed
> http://openib.org/bugzilla/show_bug.cgi?id=188 for it. I think Jeff
> opened a bug in the Open MPI bug tracker for this, he can comment on
> when it is scheduled to be fixed.
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
>> -----Original Message-----
>> From: users-bounces_at_[hidden]
>> [mailto:users-bounces_at_[hidden]] On Behalf Of Maestas,
>> Christopher Daniel
>> Sent: Monday, October 16, 2006 7:00 AM
>> To: users_at_[hidden]
>> Subject: [OMPI users] Question on mpi collectives
>> How fast/well are MPI collectives implemented in ompi?
>> I'm running the Intel MPI 1.1. benchmarks and seeing the need to set
>> wall clock times > 12 hours for run sizes of 200 and 300 nodes for
>> 1ppn and 2ppn cases. The collective tests that usually pass in 2ppn
>> Barrier, Reduce scatter, allreduce, bcast
>> The ones that take long or never run:
>> Allgather, alltoall, allgatherv