Open MPI User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2006-10-17 10:20:11


Collective performance is something that we are definitely working
on. Some of the collectives have had a fair amount of tuning done
(the "tuned" coll component), but

a) not all of them are done
b) they were originally tuned for TCP networks

Note that b) is somewhat misleading; the "tuned" coll component has
bunches of different algorithms for each collective operation that it
supports. The difficult part is deciding which algorithm to use in
each scenario. Message size, number of processes involved, and
network latency and bandwidth all play into that decision. The first
round of "decision functions" (the run-time entities that decide
which algorithm to use) assumed gigabit TCP networks.
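
To make the idea concrete, here is a minimal sketch of what a decision
function for alltoall could look like. It is an illustration only, not
the code in the "tuned" component: the enum names, the function name
choose_alltoall, and all three thresholds are hypothetical. Bruck,
linear, and pairwise exchange are well-known alltoall algorithms, but
the cutoffs below are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for three well-known alltoall algorithms.
 * These are NOT Open MPI's internal identifiers. */
typedef enum {
    ALLTOALL_BRUCK,    /* about log2(P) rounds; suits tiny messages */
    ALLTOALL_LINEAR,   /* post all sends and receives at once */
    ALLTOALL_PAIRWISE  /* P-1 steps, one exchange partner per step */
} alltoall_alg_t;

/* Made-up thresholds. In a real decision function these would come
 * from measurements on one particular network. */
#define SMALL_MSG_BYTES 256
#define LARGE_MSG_BYTES 32768
#define SMALL_COMM_SIZE 12

/* Pick an algorithm from the communicator size and the number of
 * bytes sent to each peer. */
alltoall_alg_t choose_alltoall(int comm_size, size_t bytes_per_peer)
{
    assert(comm_size > 0);

    /* Tiny messages on many processes: fewer rounds wins. */
    if (bytes_per_peer <= SMALL_MSG_BYTES && comm_size > SMALL_COMM_SIZE)
        return ALLTOALL_BRUCK;

    /* Mid-sized messages: overlap everything. */
    if (bytes_per_peer < LARGE_MSG_BYTES)
        return ALLTOALL_LINEAR;

    /* Large messages: limit each step to one partner. */
    return ALLTOALL_PAIRWISE;
}
```

Latency and bandwidth do not appear as arguments here; they are baked
into the thresholds. That is why cutoffs chosen for one kind of network
can pick the wrong algorithm on another.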

Work is ongoing at U. Tennessee to incorporate decision functions for
other networks (InfiniBand and Myrinet); we hope to include these in
v1.2.

On Oct 16, 2006, at 11:23 AM, Scott Weitzenkamp (sweitzen) wrote:

> I see this too, and had filed
> http://openib.org/bugzilla/show_bug.cgi?id=188 for it. I think Jeff
> opened a bug in the Open MPI bug tracker for this; he can comment on
> when it is scheduled to be fixed.
>
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
>
>
>> -----Original Message-----
>> From: users-bounces_at_[hidden]
>> [mailto:users-bounces_at_[hidden]] On Behalf Of Maestas,
>> Christopher Daniel
>> Sent: Monday, October 16, 2006 7:00 AM
>> To: users_at_[hidden]
>> Subject: [OMPI users] Question on mpi collectives
>>
>> How fast/well are MPI collectives implemented in ompi?
>> I'm running the Intel MPI 1.1 benchmarks and seeing the need to set
>> wall clock times > 12 hours for run sizes of 200 and 300 nodes for
>> 1ppn and 2ppn cases. The collective tests that usually pass in 2ppn
>> cases:
>> Barrier, Reduce scatter, allreduce, bcast
>>
>> The ones that take long or never run:
>> Allgather, alltoall, allgatherv
>>
>> Thanks,
>> -cdm
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems