Open MPI User's Mailing List Archives

From: Konstantin Kudin (konstantin_kudin_at_[hidden])
Date: 2006-02-02 19:10:40


 Hi all,

 There seem to have been problems with the attachment. Here is the
report:

 I did some tests of Open MPI version 1.0.2a4r8848. My motivation was
an extreme degradation of all-to-all MPI performance on 8 CPUs (it ran
like on 1 CPU). At the same time, MPICH 1.2.7 on 8 CPUs runs more like
on 4 (not like on 1!).

 This was done using SKaMPI version 4.1, from:
http://liinwww.ira.uka.de/~skampi/skampi4.1.tar.gz

 The system is a bunch of dual Opterons connected by Gigabit Ethernet.

 The MPI operation I am most interested in is all-to-all exchange.
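
 For reference, the operation I am timing is essentially just a plain
MPI_Alltoall. A minimal sketch of such a call (just an illustration,
not the SKaMPI code; the 16384 ints of 4 bytes per destination are
only meant to roughly match the long test further below):

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, i, count = 16384;  /* ints sent to each rank */
    int *sendbuf, *recvbuf;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* each rank sends one block of "count" ints to every rank */
    sendbuf = malloc((size_t)count * size * sizeof(int));
    recvbuf = malloc((size_t)count * size * sizeof(int));
    for (i = 0; i < count * size; i++)
        sendbuf[i] = rank;

    t0 = MPI_Wtime();
    MPI_Alltoall(sendbuf, count, MPI_INT, recvbuf, count, MPI_INT,
                 MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("all-to-all took %f s\n", t1 - t0);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}

(Compiled with mpicc and launched with the same kind of mpirun lines
as shown below.)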

 First of all, there seem to be some problems with the logarithmic
approach. Here is what I mean. In the following, the first column is
the packet size, the next one is the average time (microseconds), and
then comes the standard deviation. The test was done on 8 CPUs (4 dual
nodes).

>mpirun -np 8 -mca mpi_paffinity_alone 1 skampi41
#/*@inp2p_MPI_Send-MPI_Iprobe_Recv.ski*/
#Description of the MPI_Send-MPI_Iprobe_Recv measurement:
       0 74.3 1.3 8 74.3 1.3 8
      16 77.4 2.1 8 77.4 2.1 8 0.0 0.0
      32 398.9 323.4 100 398.9 323.4 100 0.0 0.0
      64 80.7 2.3 9 80.7 2.3 9 0.0 0.0
      80 79.3 2.3 13 79.3 2.3 13 0.0 0.0

>mpirun -np 8 -mca mpi_paffinity_alone 1 -mca coll_basic_crossover 8 \
skampi41
#/*@inp2p_MPI_Send-MPI_Iprobe_Recv.ski*/
#Description of the MPI_Send-MPI_Iprobe_Recv measurement:
       0 76.7 2.1 8 76.7 2.1 8
      16 75.8 1.5 8 75.8 1.5 8 0.0 0.0
      32 74.4 0.6 8 74.4 0.6 8 0.0 0.0
      64 76.3 0.4 8 76.3 0.4 8 0.0 0.0
      80 76.7 0.5 8 76.7 0.5 8 0.0 0.0

 These anomalously large times for certain packet sizes (either 16 or
32) show up for a whole set of tests unless coll_basic_crossover is
raised to 8, so this is not a fluke.
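
 As a side note, instead of passing these switches on every mpirun
line, it should also be possible to make them persistent through an
MCA parameter file (if I read the docs right, Open MPI picks up
$HOME/.openmpi/mca-params.conf), something like:

# $HOME/.openmpi/mca-params.conf
mpi_paffinity_alone = 1
coll_basic_crossover = 8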

 Next, the all-to-all results. The short test used 64x4 byte messages.
The long one used 16384x4 byte messages.
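
 Assuming "NxM byte" here means N elements of M bytes sent to each
destination, in the long test each of the 8 ranks ships 16384*4 =
65536 bytes (64 KB) to every one of the other 7 ranks, i.e. roughly
448 KB out (and 448 KB in) per rank per exchange, versus only 256
bytes per destination in the short test.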

> mpirun -np 8 -mca mpi_paffinity_alone 1 -mca coll_basic_crossover 8 \
skampi41
#/*@insyncol_MPI_Alltoall-nodes-short-SM.ski*/
       2 12.7 0.2 8 12.7 0.2 8
       3 56.1 0.3 8 56.1 0.3 8
       4 69.9 1.8 8 69.9 1.8 8
       5 87.0 2.2 8 87.0 2.2 8
       6 99.7 1.5 8 99.7 1.5 8
       7 122.5 2.2 8 122.5 2.2 8
       8 147.5 2.5 8 147.5 2.5 8

#/*@insyncol_MPI_Alltoall-nodes-long-SM.ski*/
       2 188.5 0.3 8 188.5 0.3 8
       3 1680.5 16.6 8 1680.5 16.6 8
       4 2759.0 15.5 8 2759.0 15.5 8
       5 4110.2 34.0 8 4110.2 34.0 8
       6 75443.5 44383.9 6 75443.5 44383.9 6
       7 242133.4 870.5 2 242133.4 870.5 2
       8 252436.7 4016.8 8 252436.7 4016.8 8

> mpirun -np 8 -mca mpi_paffinity_alone 1 -mca coll_basic_crossover 8 \
-mca coll_sm_info_num_procs 8 -mca btl_tcp_sndbuf 8388608 \
-mca btl_tcp_rcvbuf 8388608 skampi41
#/*@insyncol_MPI_Alltoall-nodes-short-SM.ski*/
       2 13.1 0.1 8 13.1 0.1 8
       3 57.4 0.3 8 57.4 0.3 8
       4 73.7 1.6 8 73.7 1.6 8
       5 87.1 2.0 8 87.1 2.0 8
       6 103.7 2.0 8 103.7 2.0 8
       7 118.3 2.4 8 118.3 2.4 8
       8 146.7 3.1 8 146.7 3.1 8

#/*@insyncol_MPI_Alltoall-nodes-long-SM.ski*/
       2 185.8 0.6 8 185.8 0.6 8
       3 1760.4 17.3 8 1760.4 17.3 8
       4 2916.8 52.1 8 2916.8 52.1 8
       5 106993.4 102562.4 2 106993.4 102562.4 2
       6 260723.1 6679.1 2 260723.1 6679.1 2
       7 240225.2 6369.8 6 240225.2 6369.8 6
       8 250848.1 4863.2 6 250848.1 4863.2 6

> mpirun -np 8 -mca mpi_paffinity_alone 1 -mca coll_basic_crossover 8 \
-mca coll_sm_info_num_procs 8 -mca btl_tcp_sndbuf 8388608 \
-mca btl_tcp_rcvbuf 8388608 -mca btl_tcp_min_send_size 32768 \
-mca btl_tcp_max_send_size 65536 skampi41
#/*@insyncol_MPI_Alltoall-nodes-short-SM.ski*/
       2 13.5 0.2 8 13.5 0.2 8
       3 57.3 1.8 8 57.3 1.8 8
       4 68.8 0.5 8 68.8 0.5 8
       5 83.2 0.6 8 83.2 0.6 8
       6 102.9 1.8 8 102.9 1.8 8
       7 117.4 2.3 8 117.4 2.3 8
       8 149.3 2.1 8 149.3 2.1 8

#/*@insyncol_MPI_Alltoall-nodes-long-SM.ski*/
       2 187.5 0.5 8 187.5 0.5 8
       3 1661.1 33.4 8 1661.1 33.4 8
       4 2715.9 6.9 8 2715.9 6.9 8
       5 116805.2 43036.4 8 116805.2 43036.4 8
       6 163177.7 41363.4 7 163177.7 41363.4 7
       7 233105.5 20621.4 2 233105.5 20621.4 2
       8 332049.5 83860.5 2 332049.5 83860.5 2

Same stuff for MPICH 1.2.7 (sockets, no shared memory):
#/*@insyncol_MPI_Alltoall-nodes-short-SM.ski*/
       2 312.5 106.5 100 312.5 106.5 100
       3 546.9 136.2 100 546.9 136.2 100
       4 2929.7 195.3 100 2929.7 195.3 100
       5 2070.3 203.7 100 2070.3 203.7 100
       6 2929.7 170.0 100 2929.7 170.0 100
       7 1328.1 186.0 100 1328.1 186.0 100
       8 3203.1 244.4 100 3203.1 244.4 100

#/*@insyncol_MPI_Alltoall-nodes-long-SM.ski*/
       2 390.6 117.8 100 390.6 117.8 100
       3 3164.1 252.6 100 3164.1 252.6 100
       4 5859.4 196.3 100 5859.4 196.3 100
       5 15234.4 6895.1 30 15234.4 6895.1 30
       6 18136.2 5563.7 14 18136.2 5563.7 14
       7 14204.5 2898.0 11 14204.5 2898.0 11
       8 11718.8 1594.7 4 11718.8 1594.7 4

So, as one can see, MPICH latencies are much higher for small packets,
yet things are way more consistent for larger ones. Depending on the
settings, Open MPI degrades at either 5 or 6 CPUs.

 Konstantin
