Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] SM btl slows down bandwidth?
From: Gus Correa (gus_at_[hidden])
Date: 2008-08-12 19:53:19


Hello Daniel and list

Could it be a problem with memory bandwidth / contention on multi-core?
It has been reported on many mailing lists (mpich, beowulf, etc).
We seem to see it here on dual-processor, dual-core nodes with our
memory-intensive programs.
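
One way to see where the two runs diverge is to line up the PingPong timings
Daniël quotes below. This is only a rough sketch: the rows are copied from the
two tables (larger message sizes only), and the helper names are hypothetical,
made up for illustration. The bandwidth formula is inferred from the quoted
numbers, where the "Mbytes/sec" column matches bytes / 2**20 per second.

```python
# (bytes, t_usec with sm enabled, t_usec with sm disabled),
# copied from the two PingPong tables quoted in this message.
ROWS = [
    (65536, 60.73, 92.26),
    (131072, 112.06, 141.03),
    (262144, 215.48, 233.62),
    (524288, 423.34, 434.56),
    (1048576, 858.18, 818.84),
    (2097152, 1744.15, 1403.75),
    (4194304, 4055.60, 2523.40),
]


def mbytes_per_sec(nbytes, t_usec):
    """Bandwidth as reported in the tables: MiB transferred per second."""
    return nbytes / 2**20 / (t_usec * 1e-6)


def first_size_where_sm_loses(rows):
    """Return the smallest message size at which the sm run is slower."""
    for nbytes, t_sm, t_nosm in rows:
        if t_sm > t_nosm:
            return nbytes
    return None


if __name__ == "__main__":
    for nbytes, t_sm, t_nosm in ROWS:
        print(f"{nbytes:>8} {mbytes_per_sec(nbytes, t_sm):9.2f} "
              f"{mbytes_per_sec(nbytes, t_nosm):9.2f}")
    print("sm first slower at", first_size_where_sm_loses(ROWS), "bytes")
```

With these rows the sm run is ahead up to 524288 bytes and falls behind from
1048576 bytes on, so the slowdown is confined to the largest messages. That
would be compatible with a memory bandwidth limit, but the numbers alone do
not prove it.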

Have you checked what happens to the shared memory runs as you
increase the number of active cores/processes?
Would it help to set the processor affinity in the shared memory runs?

http://www.open-mpi.org/faq/?category=building#build-paffinity
http://www.open-mpi.org/faq/?category=tuning#using-paffinity
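
As a rough sketch of the experiment I have in mind, the script below only
prints mpirun command lines for a few process counts, with the
mpi_paffinity_alone MCA parameter from the FAQ switched on. The binary name and
the -npmin / PingPong arguments are taken from Daniël's command; the helper
name imb_command, the process counts, and dropping -hostfile (single node) are
my assumptions. Note that PingPong itself exercises only two ranks, and the
remaining ranks wait in MPI_Barrier, as in the output quoted below.

```python
import shlex


def imb_command(nprocs, btl="self,sm", paffinity=True,
                binary="./IMB-MPI1.openmpi", benchmark="PingPong"):
    """Build an mpirun argument list for one IMB run (hypothetical helper)."""
    cmd = ["mpirun", "-np", str(nprocs), "--mca", "btl", btl]
    if paffinity:
        # Ask Open MPI to bind each process to a processor.
        cmd += ["--mca", "mpi_paffinity_alone", "1"]
    cmd += [binary, "-npmin", str(nprocs), benchmark]
    return cmd


if __name__ == "__main__":
    # Print one command per process count, with and without affinity,
    # so the runs can be compared side by side.
    for n in (2, 4, 8):
        for paffinity in (False, True):
            print(shlex.join(imb_command(n, paffinity=paffinity)))
```

Running the printed commands and comparing the large-message rows with and
without affinity should show whether process placement plays a role.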

Gus Correa

-- 
---------------------------------------------------------------------
Gustavo J. Ponce Correa, PhD - Email: gus_at_[hidden]
Lamont-Doherty Earth Observatory - Columbia University
P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------
Daniël Mantione wrote:
>Hello,
>
>I'm troubleshooting a weird benchmark situation in which having the sm btl
>enabled gives me worse results than disabling it.
>
>For example, this on a single compute node with 2*Xeon5420, 8 GB RAM and a 
>ConnectX gen2 IB card, with OFED 1.3 and OpenMPI 1.2.6 as software setup:
>
>[cvsupport_at_extern src]$ mpirun -np 8 --mca btl self,sm,openib -hostfile \
>hostfile ./IMB-MPI1.openmpi -npmin 8 PingPong
>
>#---------------------------------------------------
># Benchmarking PingPong
># #processes = 2
># ( 6 additional processes waiting in MPI_Barrier)
>#---------------------------------------------------
>       #bytes #repetitions      t[usec]   Mbytes/sec
>            0         1000         0.87         0.00
>            1         1000         0.98         0.97
>            2         1000         0.97         1.96
>            4         1000         0.99         3.87
>            8         1000         0.98         7.78
>           16         1000         1.15        13.33
>           32         1000         1.13        26.93
>           64         1000         1.12        54.42
>          128         1000         1.27        96.31
>          256         1000         1.55       157.01
>          512         1000         2.04       239.00
>         1024         1000         2.75       355.62
>         2048         1000         4.58       426.40
>         4096         1000         7.12       548.93
>         8192         1000        11.29       692.14
>        16384         1000        18.83       829.75
>        32768         1000        34.57       904.08
>        65536          640        60.73      1029.22
>       131072          320       112.06      1115.43
>       262144          160       215.48      1160.21
>       524288           80       423.34      1181.09
>      1048576           40       858.18      1165.26
>      2097152           20      1744.15      1146.69
>      4194304           10      4055.60       986.29
>
>Now, when disabling the sm btl, the score is:
>
>#---------------------------------------------------
># Benchmarking PingPong
># #processes = 2
># ( 6 additional processes waiting in MPI_Barrier)
>#---------------------------------------------------
>       #bytes #repetitions      t[usec]   Mbytes/sec
>            0         1000         1.08         0.00
>            1         1000         1.42         0.67
>            2         1000         1.19         1.60
>            4         1000         1.21         3.14
>            8         1000         1.61         4.75
>           16         1000         1.30        11.70
>           32         1000         1.32        23.13
>           64         1000         1.61        37.97
>          128         1000         2.80        43.53
>          256         1000         3.21        76.05
>          512         1000         4.06       120.15
>         1024         1000         5.03       194.21
>         2048         1000         7.15       273.05
>         4096         1000        10.05       388.55
>         8192         1000        16.02       487.76
>        16384         1000        29.63       527.41
>        32768         1000        51.23       610.03
>        65536          640        92.26       677.43
>       131072          320       141.03       886.36
>       262144          160       233.62      1070.14
>       524288           80       434.56      1150.60
>      1048576           40       818.84      1221.24
>      2097152           20      1403.75      1424.76
>      4194304           10      2523.40      1585.16
>
>
>Now, I do have fast Infiniband, but I can't believe that the openib btl is
>supposed to be faster than the sm btl. Does anyone know whether
>something can be tuned here?
>
>Best regards,
>
>Daniël Mantione
>