Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM
From: George Bosilca (bosilca_at_[hidden])
Date: 2008-07-23 10:01:43


Can you please try the HEAD with mpi_yield_when_idle set to 0?

   Thanks,
     george.
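
A minimal sketch of the run being requested, assuming the same NetPIPE command line Jeff quotes below. The helper name with_mca is hypothetical and only shows where the extra "--mca mpi_yield_when_idle 0" pair would go; it prints the command rather than launching it.

```python
import shlex

# Command line taken from the NetPIPE runs quoted below in this thread.
base_cmd = ["mpirun", "--mca", "mpi_paffinity_alone", "1",
            "-np", "2", "--mca", "btl", "sm,self", "NPmpi"]


def with_mca(cmd, name, value):
    """Return a copy of cmd with '--mca name value' added right after 'mpirun'.

    Hypothetical helper for illustration; it is not part of Open MPI.
    """
    return cmd[:1] + ["--mca", name, str(value)] + cmd[1:]


# Print the command so it can be pasted into a shell on the test node.
print(shlex.join(with_mca(base_cmd, "mpi_yield_when_idle", 0)))
```

The same extra pair would apply to the osu_latency run by swapping the final program name.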

On Jul 23, 2008, at 3:39 PM, Jeff Squyres wrote:

> Short version: I'm seeing a large performance drop between r18850
> and the SVN HEAD.
>
> Longer version:
>
> FWIW, I ran the tests on 3 versions on a woodcrest-class x86_64
> machine running RHEL4U4:
>
> * Trunk HEAD (r18997)
> * r18973 --> had to patch the cpu64* thingy in openib btl to get it
> to compile
> * r18850
>
> I ran both osu_latency and NetPIPE 3.7.1. In the r18997 and r18973,
> the latency for short sends over sm is *significantly* higher than
> that of r18850. Detailed results below.
>
> ================================================================
> r18997
>
> [6:27] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % mpirun --mca
> mpi_paffinity_alone 1 -np 2 --mca btl sm,self NPmpi
> 0: svbu-mpi052
> 1: svbu-mpi052
> Now starting the main loop
> 0: 1 bytes 85423 times --> 8.23 Mbps in 0.93 usec
> 1: 2 bytes 107852 times --> 16.46 Mbps in 0.93 usec
> 2: 3 bytes 107874 times --> 24.65 Mbps in 0.93 usec
> 3: 4 bytes 71801 times --> 30.36 Mbps in 1.01 usec
> 4: 6 bytes 74610 times --> 45.27 Mbps in 1.01 usec
> 5: 8 bytes 49448 times --> 60.59 Mbps in 1.01 usec
> 6: 12 bytes 62044 times --> 90.72 Mbps in 1.01 usec
> 7: 13 bytes 41287 times --> 98.58 Mbps in 1.01 usec
> 8: 16 bytes 45872 times --> 120.81 Mbps in 1.01 usec
> 9: 19 bytes 55670 times --> 143.78 Mbps in 1.01 usec
> 10: 21 bytes 62644 times --> 156.63 Mbps in 1.02 usec
> 11: 24 bytes 65172 times --> 177.63 Mbps in 1.03 usec
> 12: 27 bytes 68714 times --> 187.21 Mbps in 1.10 usec
> 13: 29 bytes 40392 times --> 201.05 Mbps in 1.10 usec
> 14: 32 bytes 43868 times --> 220.92 Mbps in 1.11 usec
> 15: 35 bytes 48072 times --> 255.73 Mbps in 1.04 usec
> 16: 45 bytes 54725 times --> 308.90 Mbps in 1.11 usec
> 17: 48 bytes 59983 times --> 329.04 Mbps in 1.11 usec
> 18: 51 bytes 61772 times --> 348.53 Mbps in 1.12 usec
> 19: 61 bytes 35126 times --> 408.86 Mbps in 1.14 usec
> 20: 64 bytes 43206 times --> 453.67 Mbps in 1.08 usec
> 21: 67 bytes 47907 times --> 487.77 Mbps in 1.05 usec
> 22: 93 bytes 51271 times --> 561.32 Mbps in 1.26 usec
> 23: 96 bytes 52741 times --> 595.08 Mbps in 1.23 usec
> 24: 99 bytes 55012 times --> 617.64 Mbps in 1.22 usec
> 25: 125 bytes 29735 times --> 736.44 Mbps in 1.29 usec
> 26: 128 bytes 38301 times --> 779.33 Mbps in 1.25 usec
> 27: 131 bytes 40525 times --> 818.32 Mbps in 1.22 usec
> 28: 189 bytes 42501 times --> 1007.67 Mbps in 1.43 usec
> 29: 192 bytes 46588 times --> 1084.13 Mbps in 1.35 usec
> 30: 195 bytes 49725 times --> 1128.97 Mbps in 1.32 usec
> 31: 253 bytes 26462 times --> 1257.97 Mbps in 1.53 usec
> 32: 256 bytes 32457 times --> 1304.17 Mbps in 1.50 usec
> 33: 259 bytes 33647 times --> 1354.14 Mbps in 1.46 usec
> 34: 381 bytes 34925 times --> 1616.43 Mbps in 1.80 usec
> 35: 384 bytes 37072 times --> 1676.92 Mbps in 1.75 usec
> 36: 387 bytes 38308 times --> 1724.50 Mbps in 1.71 usec
> 37: 509 bytes 19921 times --> 1908.30 Mbps in 2.03 usec
> 38: 512 bytes 24521 times --> 2013.16 Mbps in 1.94 usec
> 39: 515 bytes 25869 times --> 2038.18 Mbps in 1.93 usec
> 40: 765 bytes 26188 times --> 2474.81 Mbps in 2.36 usec
> 41: 768 bytes 28268 times --> 2513.00 Mbps in 2.33 usec
> 42: 771 bytes 28648 times --> 2531.45 Mbps in 2.32 usec
> 43: 1021 bytes 14512 times --> 2831.70 Mbps in 2.75 usec
> 44: 1024 bytes 18158 times --> 2853.94 Mbps in 2.74 usec
> 45: 1027 bytes 18300 times --> 2872.58 Mbps in 2.73 usec
> 46: 1533 bytes 18420 times --> 3298.65 Mbps in 3.55 usec
> 47: 1536 bytes 18802 times --> 3320.86 Mbps in 3.53 usec
> 48: 1539 bytes 18910 times --> 3351.99 Mbps in 3.50 usec
> 49: 2045 bytes 9571 times --> 3599.21 Mbps in 4.33 usec
> 50: 2048 bytes 11528 times --> 3640.91 Mbps in 4.29 usec
> 51: 2051 bytes 11662 times --> 3638.62 Mbps in 4.30 usec
> 52: 3069 bytes 11654 times --> 3905.17 Mbps in 6.00 usec
> 53: 3072 bytes 11118 times --> 3917.67 Mbps in 5.98 usec
> 54: 3075 bytes 11149 times --> 3973.53 Mbps in 5.90 usec
> 55: 4093 bytes 5662 times --> 4450.80 Mbps in 7.02 usec
> 56: 4096 bytes 7124 times --> 4445.17 Mbps in 7.03 usec
> 57: 4099 bytes 7115 times --> 4412.88 Mbps in 7.09 usec
> 58: 6141 bytes 7064 times --> 4962.74 Mbps in 9.44 usec
> 59: 6144 bytes 7061 times --> 4941.94 Mbps in 9.49 usec
> 60: 6147 bytes 7030 times --> 4938.46 Mbps in 9.50 usec
> 61: 8189 bytes 3515 times --> 5263.65 Mbps in 11.87 usec
> 62: 8192 bytes 4211 times --> 5249.31 Mbps in 11.91 usec
> 63: 8195 bytes 4200 times --> 5202.08 Mbps in 12.02 usec
> 64: 12285 bytes 4162 times --> 6380.89 Mbps in 14.69 usec
> 65: 12288 bytes 4538 times --> 6385.27 Mbps in 14.68 usec
> 66: 12291 bytes 4541 times --> 6335.05 Mbps in 14.80 usec
> 67: 16381 bytes 2253 times --> 6535.76 Mbps in 19.12 usec
> 68: 16384 bytes 2614 times --> 6537.24 Mbps in 19.12 usec
> 69: 16387 bytes 2615 times --> 6514.52 Mbps in 19.19 usec
> 70: 24573 bytes 2606 times --> 6870.51 Mbps in 27.29 usec
> 71: 24576 bytes 2443 times --> 6866.57 Mbps in 27.31 usec
> 72: 24579 bytes 2441 times --> 6864.32 Mbps in 27.32 usec
> 73: 32765 bytes 1220 times --> 7124.85 Mbps in 35.09 usec
> 74: 32768 bytes 1425 times --> 7120.30 Mbps in 35.11 usec
> 75: 32771 bytes 1424 times --> 7127.15 Mbps in 35.08 usec
> 76: 49149 bytes 1425 times --> 8313.31 Mbps in 45.11 usec
> 77: 49152 bytes 1478 times --> 8312.58 Mbps in 45.11 usec
> 78: 49155 bytes 1477 times --> 8309.34 Mbps in 45.13 usec
> 79: 65533 bytes 738 times --> 8219.82 Mbps in 60.83 usec
> 80: 65536 bytes 822 times --> 8209.24 Mbps in 60.91 usec
> 81: 65539 bytes 820 times --> 8216.00 Mbps in 60.86 usec
> 82: 98301 bytes 821 times --> 8698.24 Mbps in 86.22 usec
> 83: 98304 bytes 773 times --> 8695.03 Mbps in 86.26 usec
> 84: 98307 bytes 772 times --> 8696.95 Mbps in 86.24 usec
> 85: 131069 bytes 386 times --> 8916.50 Mbps in 112.15 usec
> 86: 131072 bytes 445 times --> 8917.29 Mbps in 112.14 usec
> 87: 131075 bytes 445 times --> 8916.62 Mbps in 112.15 usec
> 88: 196605 bytes 445 times --> 9205.17 Mbps in 162.95 usec
> 89: 196608 bytes 409 times --> 9195.75 Mbps in 163.12 usec
> 90: 196611 bytes 408 times --> 9203.02 Mbps in 162.99 usec
> 91: 262141 bytes 204 times --> 9338.32 Mbps in 214.17 usec
> 92: 262144 bytes 233 times --> 9350.57 Mbps in 213.89 usec
> 93: 262147 bytes 233 times --> 9336.72 Mbps in 214.21 usec
> 94: 393213 bytes 233 times --> 9480.21 Mbps in 316.45 usec
> 95: 393216 bytes 210 times --> 9476.10 Mbps in 316.59 usec
> 96: 393219 bytes 210 times --> 9471.25 Mbps in 316.75 usec
> 97: 524285 bytes 105 times --> 9523.20 Mbps in 420.02 usec
> 98: 524288 bytes 119 times --> 9519.53 Mbps in 420.19 usec
> 99: 524291 bytes 118 times --> 9523.09 Mbps in 420.03 usec
> 100: 786429 bytes 119 times --> 9555.83 Mbps in 627.89 usec
> 101: 786432 bytes 106 times --> 9542.67 Mbps in 628.75 usec
> 102: 786435 bytes 106 times --> 9554.47 Mbps in 627.98 usec
> 103: 1048573 bytes 53 times --> 9527.96 Mbps in 839.63 usec
> 104: 1048576 bytes 59 times --> 9530.63 Mbps in 839.40 usec
> 105: 1048579 bytes 59 times --> 9500.65 Mbps in 842.05 usec
> 106: 1572861 bytes 59 times --> 9389.53 Mbps in 1278.02 usec
> 107: 1572864 bytes 52 times --> 9396.87 Mbps in 1277.02 usec
> 108: 1572867 bytes 52 times --> 9375.01 Mbps in 1280.00 usec
> 109: 2097149 bytes 26 times --> 9271.33 Mbps in 1725.75 usec
> 110: 2097152 bytes 28 times --> 9273.64 Mbps in 1725.32 usec
> 111: 2097155 bytes 28 times --> 9281.42 Mbps in 1723.88 usec
> 112: 3145725 bytes 29 times --> 9109.93 Mbps in 2634.48 usec
> 113: 3145728 bytes 25 times --> 9128.80 Mbps in 2629.04 usec
> 114: 3145731 bytes 25 times --> 9099.66 Mbps in 2637.46 usec
> 115: 4194301 bytes 12 times --> 8840.19 Mbps in 3619.83 usec
> 116: 4194304 bytes 13 times --> 8847.10 Mbps in 3617.00 usec
> 117: 4194307 bytes 13 times --> 8827.22 Mbps in 3625.15 usec
> 118: 6291453 bytes 13 times --> 8351.40 Mbps in 5747.54 usec
> 119: 6291456 bytes 11 times --> 8345.46 Mbps in 5751.63 usec
> 120: 6291459 bytes 11 times --> 8343.42 Mbps in 5753.04 usec
> 121: 8388605 bytes 5 times --> 8166.28 Mbps in 7837.10 usec
> 122: 8388608 bytes 6 times --> 8166.91 Mbps in 7836.50 usec
> 123: 8388611 bytes 6 times --> 8162.67 Mbps in 7840.57 usec
> [6:29] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % cd ../osu/
> [6:29] svbu-mpi052:~/svn/ompi-tests/osu % mpirun --mca
> mpi_paffinity_alone 1 -np 2 --mca btl sm,self osu_latency
> # OSU MPI Latency Test (Version 2.1)
> # Size Latency (us)
> 0 0.85
> 1 0.91
> 2 0.91
> 4 0.99
> 8 0.99
> 16 0.99
> 32 1.08
> 64 1.08
> 128 1.25
> 256 1.49
> 512 1.92
> 1024 2.71
> 2048 4.40
> 4096 6.85
> 8192 11.48
> 16384 19.25
> 32768 35.25
> 65536 61.03
> 131072 113.15
> 262144 215.54
> 524288 428.19
> 1048576 880.72
> 2097152 1839.12
> 4194304 3934.90
> [6:29] svbu-mpi052:~/svn/ompi-tests/osu %
>
> ================================================================
> r18973
>
> [6:36] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % mpirun --mca
> mpi_paffinity_alone 1 -np 2 --mca btl sm,self NPmpi
> 1: svbu-mpi052
> 0: svbu-mpi052
> Now starting the main loop
> 0: 1 bytes 84392 times --> 8.29 Mbps in 0.92 usec
> 1: 2 bytes 108626 times --> 16.58 Mbps in 0.92 usec
> 2: 3 bytes 108657 times --> 24.91 Mbps in 0.92 usec
> 3: 4 bytes 72561 times --> 30.33 Mbps in 1.01 usec
> 4: 6 bytes 74529 times --> 45.51 Mbps in 1.01 usec
> 5: 8 bytes 49709 times --> 60.76 Mbps in 1.00 usec
> 6: 12 bytes 62222 times --> 90.84 Mbps in 1.01 usec
> 7: 13 bytes 41344 times --> 98.58 Mbps in 1.01 usec
> 8: 16 bytes 45875 times --> 121.19 Mbps in 1.01 usec
> 9: 19 bytes 55845 times --> 143.43 Mbps in 1.01 usec
> 10: 21 bytes 62491 times --> 156.66 Mbps in 1.02 usec
> 11: 24 bytes 65185 times --> 177.87 Mbps in 1.03 usec
> 12: 27 bytes 68806 times --> 187.63 Mbps in 1.10 usec
> 13: 29 bytes 40482 times --> 202.10 Mbps in 1.09 usec
> 14: 32 bytes 44096 times --> 222.11 Mbps in 1.10 usec
> 15: 35 bytes 48331 times --> 255.12 Mbps in 1.05 usec
> 16: 45 bytes 54593 times --> 308.42 Mbps in 1.11 usec
> 17: 48 bytes 59888 times --> 330.10 Mbps in 1.11 usec
> 18: 51 bytes 61970 times --> 348.31 Mbps in 1.12 usec
> 19: 61 bytes 35104 times --> 409.39 Mbps in 1.14 usec
> 20: 64 bytes 43261 times --> 451.69 Mbps in 1.08 usec
> 21: 67 bytes 47698 times --> 489.98 Mbps in 1.04 usec
> 22: 93 bytes 51504 times --> 565.69 Mbps in 1.25 usec
> 23: 96 bytes 53150 times --> 598.55 Mbps in 1.22 usec
> 24: 99 bytes 55333 times --> 623.24 Mbps in 1.21 usec
> 25: 125 bytes 30005 times --> 735.91 Mbps in 1.30 usec
> 26: 128 bytes 38274 times --> 781.32 Mbps in 1.25 usec
> 27: 131 bytes 40628 times --> 828.90 Mbps in 1.21 usec
> 28: 189 bytes 43050 times --> 1018.02 Mbps in 1.42 usec
> 29: 192 bytes 47066 times --> 1069.01 Mbps in 1.37 usec
> 30: 195 bytes 49032 times --> 1122.18 Mbps in 1.33 usec
> 31: 253 bytes 26303 times --> 1259.95 Mbps in 1.53 usec
> 32: 256 bytes 32508 times --> 1307.53 Mbps in 1.49 usec
> 33: 259 bytes 33734 times --> 1357.47 Mbps in 1.46 usec
> 34: 381 bytes 35011 times --> 1617.08 Mbps in 1.80 usec
> 35: 384 bytes 37087 times --> 1675.72 Mbps in 1.75 usec
> 36: 387 bytes 38280 times --> 1722.27 Mbps in 1.71 usec
> 37: 509 bytes 19895 times --> 1913.58 Mbps in 2.03 usec
> 38: 512 bytes 24589 times --> 1967.08 Mbps in 1.99 usec
> 39: 515 bytes 25276 times --> 2041.10 Mbps in 1.93 usec
> 40: 765 bytes 26226 times --> 2448.96 Mbps in 2.38 usec
> 41: 768 bytes 27973 times --> 2503.60 Mbps in 2.34 usec
> 42: 771 bytes 28541 times --> 2541.12 Mbps in 2.31 usec
> 43: 1021 bytes 14567 times --> 2845.46 Mbps in 2.74 usec
> 44: 1024 bytes 18246 times --> 2854.45 Mbps in 2.74 usec
> 45: 1027 bytes 18304 times --> 2939.64 Mbps in 2.67 usec
> 46: 1533 bytes 18850 times --> 3291.70 Mbps in 3.55 usec
> 47: 1536 bytes 18762 times --> 3310.45 Mbps in 3.54 usec
> 48: 1539 bytes 18851 times --> 3386.68 Mbps in 3.47 usec
> 49: 2045 bytes 9670 times --> 3635.22 Mbps in 4.29 usec
> 50: 2048 bytes 11644 times --> 3646.70 Mbps in 4.28 usec
> 51: 2051 bytes 11680 times --> 3640.09 Mbps in 4.30 usec
> 52: 3069 bytes 11659 times --> 3926.68 Mbps in 5.96 usec
> 53: 3072 bytes 11180 times --> 3962.33 Mbps in 5.92 usec
> 54: 3075 bytes 11276 times --> 3978.54 Mbps in 5.90 usec
> 55: 4093 bytes 5669 times --> 4398.66 Mbps in 7.10 usec
> 56: 4096 bytes 7041 times --> 4429.95 Mbps in 7.05 usec
> 57: 4099 bytes 7091 times --> 4378.99 Mbps in 7.14 usec
> 58: 6141 bytes 7009 times --> 5001.17 Mbps in 9.37 usec
> 59: 6144 bytes 7116 times --> 4984.01 Mbps in 9.41 usec
> 60: 6147 bytes 7090 times --> 5015.48 Mbps in 9.35 usec
> 61: 8189 bytes 3570 times --> 5286.90 Mbps in 11.82 usec
> 62: 8192 bytes 4230 times --> 5222.58 Mbps in 11.97 usec
> 63: 8195 bytes 4179 times --> 5261.91 Mbps in 11.88 usec
> 64: 12285 bytes 4210 times --> 6370.90 Mbps in 14.71 usec
> 65: 12288 bytes 4531 times --> 6376.57 Mbps in 14.70 usec
> 66: 12291 bytes 4535 times --> 6349.10 Mbps in 14.77 usec
> 67: 16381 bytes 2258 times --> 6521.57 Mbps in 19.16 usec
> 68: 16384 bytes 2608 times --> 6520.25 Mbps in 19.17 usec
> 69: 16387 bytes 2608 times --> 6504.81 Mbps in 19.22 usec
> 70: 24573 bytes 2602 times --> 6867.93 Mbps in 27.30 usec
> 71: 24576 bytes 2442 times --> 6869.27 Mbps in 27.30 usec
> 72: 24579 bytes 2442 times --> 6864.04 Mbps in 27.32 usec
> 73: 32765 bytes 1220 times --> 7118.03 Mbps in 35.12 usec
> 74: 32768 bytes 1423 times --> 7117.77 Mbps in 35.12 usec
> 75: 32771 bytes 1423 times --> 7120.85 Mbps in 35.11 usec
> 76: 49149 bytes 1424 times --> 8324.26 Mbps in 45.05 usec
> 77: 49152 bytes 1479 times --> 8328.77 Mbps in 45.02 usec
> 78: 49155 bytes 1480 times --> 8320.47 Mbps in 45.07 usec
> 79: 65533 bytes 739 times --> 8214.38 Mbps in 60.87 usec
> 80: 65536 bytes 821 times --> 8219.87 Mbps in 60.83 usec
> 81: 65539 bytes 822 times --> 8232.40 Mbps in 60.74 usec
> 82: 98301 bytes 823 times --> 8717.21 Mbps in 86.03 usec
> 83: 98304 bytes 774 times --> 8716.08 Mbps in 86.05 usec
> 84: 98307 bytes 774 times --> 8714.26 Mbps in 86.07 usec
> 85: 131069 bytes 387 times --> 8921.59 Mbps in 112.09 usec
> 86: 131072 bytes 446 times --> 8935.37 Mbps in 111.91 usec
> 87: 131075 bytes 446 times --> 8925.47 Mbps in 112.04 usec
> 88: 196605 bytes 446 times --> 9195.80 Mbps in 163.12 usec
> 89: 196608 bytes 408 times --> 9197.41 Mbps in 163.09 usec
> 90: 196611 bytes 408 times --> 9204.33 Mbps in 162.97 usec
> 91: 262141 bytes 204 times --> 9344.95 Mbps in 214.02 usec
> 92: 262144 bytes 233 times --> 9347.58 Mbps in 213.96 usec
> 93: 262147 bytes 233 times --> 9340.56 Mbps in 214.12 usec
> 94: 393213 bytes 233 times --> 9473.27 Mbps in 316.68 usec
> 95: 393216 bytes 210 times --> 9486.24 Mbps in 316.25 usec
> 96: 393219 bytes 210 times --> 9500.26 Mbps in 315.78 usec
> 97: 524285 bytes 105 times --> 9538.88 Mbps in 419.33 usec
> 98: 524288 bytes 119 times --> 9543.40 Mbps in 419.14 usec
> 99: 524291 bytes 119 times --> 9534.73 Mbps in 419.52 usec
> 100: 786429 bytes 119 times --> 9574.15 Mbps in 626.69 usec
> 101: 786432 bytes 106 times --> 9565.70 Mbps in 627.24 usec
> 102: 786435 bytes 106 times --> 9544.50 Mbps in 628.64 usec
> 103: 1048573 bytes 53 times --> 9530.85 Mbps in 839.38 usec
> 104: 1048576 bytes 59 times --> 9525.24 Mbps in 839.87 usec
> 105: 1048579 bytes 59 times --> 9511.86 Mbps in 841.06 usec
> 106: 1572861 bytes 59 times --> 9391.40 Mbps in 1277.76 usec
> 107: 1572864 bytes 52 times --> 9395.54 Mbps in 1277.20 usec
> 108: 1572867 bytes 52 times --> 9386.02 Mbps in 1278.50 usec
> 109: 2097149 bytes 26 times --> 9298.48 Mbps in 1720.71 usec
> 110: 2097152 bytes 29 times --> 9313.43 Mbps in 1717.95 usec
> 111: 2097155 bytes 29 times --> 9293.49 Mbps in 1721.64 usec
> 112: 3145725 bytes 29 times --> 9126.67 Mbps in 2629.65 usec
> 113: 3145728 bytes 25 times --> 9113.76 Mbps in 2633.38 usec
> 114: 3145731 bytes 25 times --> 9079.90 Mbps in 2643.20 usec
> 115: 4194301 bytes 12 times --> 8810.57 Mbps in 3632.00 usec
> 116: 4194304 bytes 13 times --> 8821.99 Mbps in 3627.30 usec
> 117: 4194307 bytes 13 times --> 8801.17 Mbps in 3635.88 usec
> 118: 6291453 bytes 13 times --> 8337.50 Mbps in 5757.12 usec
> 119: 6291456 bytes 11 times --> 8332.94 Mbps in 5760.27 usec
> 120: 6291459 bytes 11 times --> 8346.25 Mbps in 5751.09 usec
> 121: 8388605 bytes 5 times --> 8159.20 Mbps in 7843.90 usec
> 122: 8388608 bytes 6 times --> 8166.83 Mbps in 7836.58 usec
> 123: 8388611 bytes 6 times --> 8161.26 Mbps in 7841.92 usec
> [6:37] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % cd ../osu/
> [6:37] svbu-mpi052:~/svn/ompi-tests/osu % mpirun --mca
> mpi_paffinity_alone 1 -np 2 --mca btl sm,self osu_latency
> # OSU MPI Latency Test (Version 2.1)
> # Size Latency (us)
> 0 0.85
> 1 0.91
> 2 0.91
> 4 0.99
> 8 0.99
> 16 0.99
> 32 1.09
> 64 1.07
> 128 1.25
> 256 1.49
> 512 1.97
> 1024 2.69
> 2048 4.29
> 4096 6.83
> 8192 11.41
> 16384 19.69
> 32768 35.27
> 65536 61.06
> 131072 112.51
> 262144 215.47
> 524288 429.60
> 1048576 882.89
> 2097152 1836.45
> 4194304 3943.47
> [6:37] svbu-mpi052:~/svn/ompi-tests/osu %
>
> ================================================================
> r18850
> [6:31] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % mpirun --mca
> mpi_paffinity_alone 1 -np 2 --mca btl sm,self NPmpi
> 0: svbu-mpi052
> 1: svbu-mpi052
> Now starting the main loop
> 0: 1 bytes 116185 times --> 11.32 Mbps in 0.67 usec
> 1: 2 bytes 148348 times --> 22.58 Mbps in 0.68 usec
> 2: 3 bytes 147969 times --> 33.88 Mbps in 0.68 usec
> 3: 4 bytes 98695 times --> 40.58 Mbps in 0.75 usec
> 4: 6 bytes 99737 times --> 60.85 Mbps in 0.75 usec
> 5: 8 bytes 66464 times --> 81.13 Mbps in 0.75 usec
> 6: 12 bytes 83076 times --> 121.58 Mbps in 0.75 usec
> 7: 13 bytes 55334 times --> 131.83 Mbps in 0.75 usec
> 8: 16 bytes 61344 times --> 161.81 Mbps in 0.75 usec
> 9: 19 bytes 74561 times --> 190.93 Mbps in 0.76 usec
> 10: 21 bytes 83186 times --> 207.97 Mbps in 0.77 usec
> 11: 24 bytes 86535 times --> 235.30 Mbps in 0.78 usec
> 12: 27 bytes 91024 times --> 241.36 Mbps in 0.85 usec
> 13: 29 bytes 52074 times --> 260.24 Mbps in 0.85 usec
> 14: 32 bytes 56782 times --> 286.57 Mbps in 0.85 usec
> 15: 35 bytes 62357 times --> 341.55 Mbps in 0.78 usec
> 16: 45 bytes 73090 times --> 400.53 Mbps in 0.86 usec
> 17: 48 bytes 77776 times --> 425.94 Mbps in 0.86 usec
> 18: 51 bytes 79963 times --> 449.27 Mbps in 0.87 usec
> 19: 61 bytes 45280 times --> 520.58 Mbps in 0.89 usec
> 20: 64 bytes 55011 times --> 589.77 Mbps in 0.83 usec
> 21: 67 bytes 62279 times --> 651.96 Mbps in 0.78 usec
> 22: 93 bytes 68530 times --> 706.75 Mbps in 1.00 usec
> 23: 96 bytes 66405 times --> 756.56 Mbps in 0.97 usec
> 24: 99 bytes 69940 times --> 786.11 Mbps in 0.96 usec
> 25: 125 bytes 37846 times --> 917.31 Mbps in 1.04 usec
> 26: 128 bytes 47708 times --> 991.21 Mbps in 0.99 usec
> 27: 131 bytes 51542 times --> 1030.40 Mbps in 0.97 usec
> 28: 189 bytes 53515 times --> 1228.14 Mbps in 1.17 usec
> 29: 192 bytes 56781 times --> 1317.94 Mbps in 1.11 usec
> 30: 195 bytes 60449 times --> 1372.28 Mbps in 1.08 usec
> 31: 253 bytes 32165 times --> 1506.60 Mbps in 1.28 usec
> 32: 256 bytes 38871 times --> 1590.08 Mbps in 1.23 usec
> 33: 259 bytes 41024 times --> 1657.90 Mbps in 1.19 usec
> 34: 381 bytes 42760 times --> 1894.98 Mbps in 1.53 usec
> 35: 384 bytes 43460 times --> 1958.92 Mbps in 1.50 usec
> 36: 387 bytes 44750 times --> 2029.44 Mbps in 1.45 usec
> 37: 509 bytes 23444 times --> 2176.96 Mbps in 1.78 usec
> 38: 512 bytes 27974 times --> 2268.97 Mbps in 1.72 usec
> 39: 515 bytes 29156 times --> 2340.62 Mbps in 1.68 usec
> 40: 765 bytes 30074 times --> 2698.17 Mbps in 2.16 usec
> 41: 768 bytes 30819 times --> 2778.48 Mbps in 2.11 usec
> 42: 771 bytes 31674 times --> 2847.11 Mbps in 2.07 usec
> 43: 1021 bytes 16322 times --> 3039.90 Mbps in 2.56 usec
> 44: 1024 bytes 19493 times --> 3161.06 Mbps in 2.47 usec
> 45: 1027 bytes 20270 times --> 3221.90 Mbps in 2.43 usec
> 46: 1533 bytes 20660 times --> 3455.95 Mbps in 3.38 usec
> 47: 1536 bytes 19698 times --> 3580.63 Mbps in 3.27 usec
> 48: 1539 bytes 20389 times --> 3623.40 Mbps in 3.24 usec
> 49: 2045 bytes 10346 times --> 3751.80 Mbps in 4.16 usec
> 50: 2048 bytes 12017 times --> 3833.40 Mbps in 4.08 usec
> 51: 2051 bytes 12278 times --> 3813.67 Mbps in 4.10 usec
> 52: 3069 bytes 12215 times --> 3997.25 Mbps in 5.86 usec
> 53: 3072 bytes 11381 times --> 4058.18 Mbps in 5.78 usec
> 54: 3075 bytes 11548 times --> 4102.09 Mbps in 5.72 usec
> 55: 4093 bytes 5845 times --> 4726.24 Mbps in 6.61 usec
> 56: 4096 bytes 7565 times --> 4679.74 Mbps in 6.68 usec
> 57: 4099 bytes 7491 times --> 4649.50 Mbps in 6.73 usec
> 58: 6141 bytes 7442 times --> 5072.39 Mbps in 9.24 usec
> 59: 6144 bytes 7217 times --> 5064.70 Mbps in 9.26 usec
> 60: 6147 bytes 7204 times --> 5067.07 Mbps in 9.26 usec
> 61: 8189 bytes 3606 times --> 5387.85 Mbps in 11.60 usec
> 62: 8192 bytes 4311 times --> 5393.87 Mbps in 11.59 usec
> 63: 8195 bytes 4316 times --> 5301.81 Mbps in 11.79 usec
> 64: 12285 bytes 4242 times --> 6568.81 Mbps in 14.27 usec
> 65: 12288 bytes 4672 times --> 6561.90 Mbps in 14.29 usec
> 66: 12291 bytes 4666 times --> 6548.01 Mbps in 14.32 usec
> 67: 16381 bytes 2329 times --> 6662.43 Mbps in 18.76 usec
> 68: 16384 bytes 2665 times --> 6655.18 Mbps in 18.78 usec
> 69: 16387 bytes 2662 times --> 6634.79 Mbps in 18.84 usec
> 70: 24573 bytes 2654 times --> 6937.26 Mbps in 27.02 usec
> 71: 24576 bytes 2466 times --> 6937.41 Mbps in 27.03 usec
> 72: 24579 bytes 2466 times --> 6931.40 Mbps in 27.05 usec
> 73: 32765 bytes 1232 times --> 7218.55 Mbps in 34.63 usec
> 74: 32768 bytes 1443 times --> 7213.85 Mbps in 34.66 usec
> 75: 32771 bytes 1442 times --> 7218.89 Mbps in 34.63 usec
> 76: 49149 bytes 1443 times --> 8387.79 Mbps in 44.71 usec
> 77: 49152 bytes 1491 times --> 8385.50 Mbps in 44.72 usec
> 78: 49155 bytes 1490 times --> 8390.79 Mbps in 44.69 usec
> 79: 65533 bytes 745 times --> 8261.32 Mbps in 60.52 usec
> 80: 65536 bytes 826 times --> 8260.34 Mbps in 60.53 usec
> 81: 65539 bytes 826 times --> 8265.33 Mbps in 60.50 usec
> 82: 98301 bytes 826 times --> 8747.13 Mbps in 85.74 usec
> 83: 98304 bytes 777 times --> 8746.72 Mbps in 85.75 usec
> 84: 98307 bytes 777 times --> 8733.81 Mbps in 85.88 usec
> 85: 131069 bytes 388 times --> 8956.71 Mbps in 111.65 usec
> 86: 131072 bytes 447 times --> 8967.16 Mbps in 111.52 usec
> 87: 131075 bytes 448 times --> 8960.56 Mbps in 111.60 usec
> 88: 196605 bytes 448 times --> 9247.58 Mbps in 162.20 usec
> 89: 196608 bytes 411 times --> 9234.30 Mbps in 162.44 usec
> 90: 196611 bytes 410 times --> 9231.32 Mbps in 162.49 usec
> 91: 262141 bytes 205 times --> 9365.98 Mbps in 213.54 usec
> 92: 262144 bytes 234 times --> 9368.25 Mbps in 213.49 usec
> 93: 262147 bytes 234 times --> 9363.09 Mbps in 213.61 usec
> 94: 393213 bytes 234 times --> 9512.63 Mbps in 315.37 usec
> 95: 393216 bytes 211 times --> 9497.01 Mbps in 315.89 usec
> 96: 393219 bytes 211 times --> 9510.80 Mbps in 315.43 usec
> 97: 524285 bytes 105 times --> 9553.55 Mbps in 418.69 usec
> 98: 524288 bytes 119 times --> 9561.59 Mbps in 418.34 usec
> 99: 524291 bytes 119 times --> 9551.86 Mbps in 418.77 usec
> 100: 786429 bytes 119 times --> 9582.63 Mbps in 626.13 usec
> 101: 786432 bytes 106 times --> 9576.72 Mbps in 626.52 usec
> 102: 786435 bytes 106 times --> 9584.78 Mbps in 625.99 usec
> 103: 1048573 bytes 53 times --> 9545.32 Mbps in 838.10 usec
> 104: 1048576 bytes 59 times --> 9532.37 Mbps in 839.25 usec
> 105: 1048579 bytes 59 times --> 9542.90 Mbps in 838.32 usec
> 106: 1572861 bytes 59 times --> 9434.44 Mbps in 1271.93 usec
> 107: 1572864 bytes 52 times --> 9400.64 Mbps in 1276.51 usec
> 108: 1572867 bytes 52 times --> 9409.24 Mbps in 1275.34 usec
> 109: 2097149 bytes 26 times --> 9305.75 Mbps in 1719.36 usec
> 110: 2097152 bytes 29 times --> 9314.56 Mbps in 1717.74 usec
> 111: 2097155 bytes 29 times --> 9278.43 Mbps in 1724.43 usec
> 112: 3145725 bytes 28 times --> 9065.15 Mbps in 2647.50 usec
> 113: 3145728 bytes 25 times --> 9095.10 Mbps in 2638.78 usec
> 114: 3145731 bytes 25 times --> 9073.88 Mbps in 2644.96 usec
> 115: 4194301 bytes 12 times --> 8772.63 Mbps in 3647.70 usec
> 116: 4194304 bytes 13 times --> 8768.32 Mbps in 3649.50 usec
> 117: 4194307 bytes 13 times --> 8771.37 Mbps in 3648.24 usec
> 118: 6291453 bytes 13 times --> 8321.22 Mbps in 5768.38 usec
> 119: 6291456 bytes 11 times --> 8320.00 Mbps in 5769.23 usec
> 120: 6291459 bytes 11 times --> 8335.25 Mbps in 5758.68 usec
> 121: 8388605 bytes 5 times --> 8167.02 Mbps in 7836.39 usec
> 122: 8388608 bytes 6 times --> 8165.44 Mbps in 7837.91 usec
> 123: 8388611 bytes 6 times --> 8162.24 Mbps in 7840.99 usec
> [6:32] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % cd ../osu/
> [6:32] svbu-mpi052:~/svn/ompi-tests/osu % mpirun --mca
> mpi_paffinity_alone 1 -np 2 --mca btl sm,self osu_latency
> # OSU MPI Latency Test (Version 2.1)
> # Size Latency (us)
> 0 0.65
> 1 0.69
> 2 0.69
> 4 0.76
> 8 0.76
> 16 0.76
> 32 0.85
> 64 0.83
> 128 1.03
> 256 1.25
> 512 1.73
> 1024 2.47
> 2048 4.18
> 4096 6.53
> 8192 11.23
> 16384 18.91
> 32768 34.97
> 65536 60.80
> 131072 112.09
> 262144 215.15
> 524288 427.97
> 1048576 880.90
> 2097152 1840.40
> 4194304 3945.23
> [6:33] svbu-mpi052:~/svn/ompi-tests/osu %
>
>
>
> On Jul 23, 2008, at 7:24 AM, Lenny Verkhovsky wrote:
>
>> Sorry Terry, :).
>>
>> ---------- Forwarded message ----------
>> From: Lenny Verkhovsky <lenny.verkhovsky_at_[hidden]>
>> Date: Jul 23, 2008 2:22 PM
>> Subject: Re: [OMPI devel] [OMPI bugs] [Open MPI] #1250: Performance
>> problem on SM
>> To: Lenny Verkhovsky <lenny.verkhovsky_at_[hidden]>
>>
>>
>>
>> On 7/23/08, Terry Dontje <Terry.Dontje_at_[hidden]> wrote:
>>
>> I didn't see any attached results on the email.
>>
>> --td
>>
>> Lenny Verkhovsky wrote:
>>
>> I rechecked it on the same node, still no degradation; see results
>> attached.
>>
>>
>> On 7/22/08, *Open MPI* <bugs_at_[hidden]> wrote:
>>
>> #1250: Performance problem on SM
>> --------------------+-------------------------------------------------------
>> Reporter: bosilca   | Owner: bosilca
>> Type: defect        | Status: assigned
>> Priority: blocker   | Milestone: Open MPI 1.3
>> Version:            | Resolution:
>> Keywords:           |
>> --------------------+-------------------------------------------------------
>>
>>
>> Comment(by tdd):
>>
>> Hmmm, Lennyve isn't your mpirun above going across nodes and not on the
>> same node? I am running netpipe on a single node.
>>
>>
>> --
>> Ticket URL:
>> <https://svn.open-mpi.org/trac/ompi/ticket/1250#comment:20>
>>
>> Open MPI <http://www.open-mpi.org/>
>>
>>
>> _______________________________________________
>> bugs mailing list
>> bugs_at_[hidden] <mailto:bugs_at_[hidden]>
>> http://www.open-mpi.org/mailman/listinfo.cgi/bugs
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>>
>>
>> <NPmpi.log>
>
>
> --
> Jeff Squyres
> Cisco Systems
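
For a quick read of the osu_latency tables quoted above, here is a small sketch that compares r18997 against r18850 for message sizes up to 1024 bytes. The numbers are copied from the quoted runs; the variable names are only illustrative. On these values the gap stays between roughly 0.19 and 0.25 usec at every size, which would be consistent with a fixed per-message cost rather than a bandwidth effect, though that is only a reading of one set of runs.

```python
# osu_latency results (usec) copied from the runs quoted above, sizes 0..1024 bytes.
sizes = [0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
r18997 = [0.85, 0.91, 0.91, 0.99, 0.99, 0.99, 1.08, 1.08, 1.25, 1.49, 1.92, 2.71]
r18850 = [0.65, 0.69, 0.69, 0.76, 0.76, 0.76, 0.85, 0.83, 1.03, 1.25, 1.73, 2.47]

# For each size: (size, absolute slowdown in usec, relative slowdown in percent).
deltas = [
    (size, new - old, 100.0 * (new - old) / old)
    for size, new, old in zip(sizes, r18997, r18850)
]

for size, abs_usec, rel_pct in deltas:
    print(f"{size:5d} bytes: +{abs_usec:.2f} usec ({rel_pct:+.1f}%)")
```

Swapping in the r18973 column gives the comparison for the intermediate revision.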


