
Open MPI User's Mailing List Archives


From: Galen M. Shipman (gshipman_at_[hidden])
Date: 2006-02-05 13:38:41


Hi Konstantin,

> MPI_Alltoall_Isend_Irecv

This is a very unscalable algorithm in SKaMPI: it simply posts N
MPI_Irecv's and N MPI_Isend's and then does an MPI_Waitall. We
shouldn't have an issue on 8 procs, but in general I would expect the
performance of this algorithm to degrade quite quickly, especially
compared to Open MPI's tuned collectives. I can dig into this a bit
more if you send me your .skampi file configured to run this
particular benchmark.
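
For reference, here is a rough sketch (not SKaMPI's actual source; the
function name, buffer sizes, and tag are made up for illustration) of
the pattern described above in MPI C -- one Irecv and one Isend per
peer, all outstanding at once, followed by a single Waitall:

/*
 * Minimal sketch of the Isend/Irecv all-to-all pattern: post one
 * nonblocking receive and one nonblocking send per peer, then wait
 * on all 2*N requests at once.
 */
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

static void alltoall_isend_irecv(char *sendbuf, char *recvbuf,
                                 int blocklen, MPI_Comm comm)
{
    int nprocs;
    MPI_Comm_size(comm, &nprocs);

    /* 2*N outstanding requests: receives first, then sends. */
    MPI_Request *reqs = malloc(2 * nprocs * sizeof(MPI_Request));

    for (int i = 0; i < nprocs; i++)
        MPI_Irecv(recvbuf + (size_t)i * blocklen, blocklen, MPI_CHAR,
                  i, 0, comm, &reqs[i]);

    for (int i = 0; i < nprocs; i++)
        MPI_Isend(sendbuf + (size_t)i * blocklen, blocklen, MPI_CHAR,
                  i, 0, comm, &reqs[nprocs + i]);

    /* Every request is in flight simultaneously, which is why this
     * degrades badly as N (and the message size) grows. */
    MPI_Waitall(2 * nprocs, reqs, MPI_STATUSES_IGNORE);
    free(reqs);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int nprocs;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int blocklen = 1 << 16;   /* 64 KiB per peer; arbitrary for the sketch */
    char *sendbuf = malloc((size_t)nprocs * blocklen);
    char *recvbuf = malloc((size_t)nprocs * blocklen);
    memset(sendbuf, 0, (size_t)nprocs * blocklen);

    alltoall_isend_irecv(sendbuf, recvbuf, blocklen, MPI_COMM_WORLD);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}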

Thanks,

Galen

On Feb 4, 2006, at 9:37 AM, Konstantin Kudin wrote:

> Dear Jeff and Galen,
>
> I have tried openmpi-1.1a1r8890. The good news is that it seems like
> the freakishly long latencies for certain packet sizes went away with
> the options they previously showed up with. Also, one version of
> all-to-all appears to behave more nicely with a specified set of
> parameters. However, I still get only 1-CPU performance out of 8 with
> the actual application, and all of this time is spent doing parallel
> FFTs. What is interesting is that even with the tuned parameters, the
> other version of all-to-all still performs quite poorly (see below).
>
> #/*@insyncol_MPI_Alltoall-nodes-long-SM.ski*/
> mpirun -np 8 -mca btl tcp -mca coll self,basic,tuned -mca \
> mpi_paffinity_alone 1 skampi41
> 2 272.1 3.7 8 272.1 3.7 8
> 3 1800.5 72.9 8 1800.5 72.9 8
> 4 3074.0 61.0 8 3074.0 61.0 8
> 5 5705.5 102.0 8 5705.5 102.0 8
> 6 8054.2 282.3 8 8054.2 282.3 8
> 7 9462.9 104.2 8 9462.9 104.2 8
> 8 11245.8 66.9 8 11245.8 66.9 8
>
> mpirun -np 8 -mca btl tcp -mca coll self,basic,tuned -mca \
> mpi_paffinity_alone 1 -mca coll_basic_crossover 8 skampi41
> 2 267.7 1.5 8 267.7 1.5 8
> 3 1591.2 8.4 8 1591.2 8.4 8
> 4 2704.4 17.1 8 2704.4 17.1 8
> 5 4813.7 307.9 3 4813.7 307.9 3
> 6 5329.1 57.0 2 5329.1 57.0 2
> 7 198767.6 49076.2 5 198767.6 49076.2 5
> 8 254832.6 11235.3 5 254832.6 11235.3 5
>
>
> Still poor performance:
>
> #/*@insyncol_MPI_Alltoall_Isend_Irecv-nodes-long-SM.ski*/
> 2 235.0 0.7 8 235.0 0.7 8
> 3 1565.6 15.3 8 1565.6 15.3 8
> 4 2694.8 24.3 8 2694.8 24.3 8
> 5 11389.9 6971.9 6 11389.9 6971.9 6
> 6 249612.0 21102.1 2 249612.0 21102.1 2
> 7 239051.9 3915.0 2 239051.9 3915.0 2
> 8 262356.5 12324.6 2 262356.5 12324.6 2
>
>
> Kostya
>
>
>
>
> --- Jeff Squyres <jsquyres_at_[hidden]> wrote:
>
>> Greetings Konstantin.
>>
>> Many thanks for this report. Another user submitted almost the same
>> issue earlier today (poor performance of Open MPI 1.0.x collectives;
>> see http://www.open-mpi.org/community/lists/users/2006/02/0558.php).
>>
>> Let me provide an additional clarification on Galen's reply:
>>
>> The collectives in Open MPI 1.0.x are known to be sub-optimal -- they
>> return correct results, but they are not optimized at all. This is
>> what Galen meant by "If I use the basic collectives then things do
>> fall apart with long messages, but this is expected". The
>> collectives in the Open MPI 1.1.x series (i.e., our current
>> development trunk) provide *much* better performance.
>>
>> Galen ran his tests using the "tuned" collective module in the 1.1.x
>> series -- these are the "better" collectives that I referred to
>> above. This "tuned" module does not exist in the 1.0.x series.
>>
>> You can download a 1.1.x nightly snapshot -- including the new
>> "tuned" module -- from here:
>>
>> http://www.open-mpi.org/nightly/trunk/
>>
>> If you get the opportunity, could you re-try your application with a
>> 1.1 snapshot?
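
For anyone following along, a quick way to check whether the "tuned"
collective component is present in a given build, and to select it
explicitly (the mpirun options below are the same ones used in the
benchmarks above; "./your_app" is just a placeholder for your own
binary):

# List the collective (coll) components compiled into the installation;
# a 1.1.x build should include "tuned".
ompi_info | grep "MCA coll"

# Run with the tuned collectives selected explicitly, as in the runs above.
mpirun -np 8 -mca btl tcp -mca coll self,basic,tuned \
       -mca mpi_paffinity_alone 1 ./your_app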