Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM
From: George Bosilca (bosilca_at_[hidden])
Date: 2008-07-23 10:01:43


Can you try the HEAD with mpi_yield_when_idle set to 0, please?

   Thanks,
     george.

On Jul 23, 2008, at 3:39 PM, Jeff Squyres wrote:

> Short version: I'm seeing a large performance drop between r18850
> and the SVN HEAD.
>
> Longer version:
>
> FWIW, I ran the tests on 3 versions on a woodcrest-class x86_64
> machine running RHEL4U4:
>
> * Trunk HEAD (r18997)
> * r18973 --> had to patch the cpu64* thingy in openib btl to get it
> to compile
> * r18850
>
> I ran both osu_latency and NetPIPE 3.7.1. In the r18997 and r18973,
> the latency for short sends over sm is *significantly* higher than
> that of r18850. Detailed results below.
>
> ================================================================
> r18997
>
> [6:27] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % mpirun --mca
> mpi_paffinity_alone 1 -np 2 --mca btl sm,self NPmpi
> 0: svbu-mpi052
> 1: svbu-mpi052
> Now starting the main loop
> 0: 1 bytes 85423 times --> 8.23 Mbps in 0.93 usec
> 1: 2 bytes 107852 times --> 16.46 Mbps in 0.93 usec
> 2: 3 bytes 107874 times --> 24.65 Mbps in 0.93 usec
> 3: 4 bytes 71801 times --> 30.36 Mbps in 1.01 usec
> 4: 6 bytes 74610 times --> 45.27 Mbps in 1.01 usec
> 5: 8 bytes 49448 times --> 60.59 Mbps in 1.01 usec
> 6: 12 bytes 62044 times --> 90.72 Mbps in 1.01 usec
> 7: 13 bytes 41287 times --> 98.58 Mbps in 1.01 usec
> 8: 16 bytes 45872 times --> 120.81 Mbps in 1.01 usec
> 9: 19 bytes 55670 times --> 143.78 Mbps in 1.01 usec
> 10: 21 bytes 62644 times --> 156.63 Mbps in 1.02 usec
> 11: 24 bytes 65172 times --> 177.63 Mbps in 1.03 usec
> 12: 27 bytes 68714 times --> 187.21 Mbps in 1.10 usec
> 13: 29 bytes 40392 times --> 201.05 Mbps in 1.10 usec
> 14: 32 bytes 43868 times --> 220.92 Mbps in 1.11 usec
> 15: 35 bytes 48072 times --> 255.73 Mbps in 1.04 usec
> 16: 45 bytes 54725 times --> 308.90 Mbps in 1.11 usec
> 17: 48 bytes 59983 times --> 329.04 Mbps in 1.11 usec
> 18: 51 bytes 61772 times --> 348.53 Mbps in 1.12 usec
> 19: 61 bytes 35126 times --> 408.86 Mbps in 1.14 usec
> 20: 64 bytes 43206 times --> 453.67 Mbps in 1.08 usec
> 21: 67 bytes 47907 times --> 487.77 Mbps in 1.05 usec
> 22: 93 bytes 51271 times --> 561.32 Mbps in 1.26 usec
> 23: 96 bytes 52741 times --> 595.08 Mbps in 1.23 usec
> 24: 99 bytes 55012 times --> 617.64 Mbps in 1.22 usec
> 25: 125 bytes 29735 times --> 736.44 Mbps in 1.29 usec
> 26: 128 bytes 38301 times --> 779.33 Mbps in 1.25 usec
> 27: 131 bytes 40525 times --> 818.32 Mbps in 1.22 usec
> 28: 189 bytes 42501 times --> 1007.67 Mbps in 1.43 usec
> 29: 192 bytes 46588 times --> 1084.13 Mbps in 1.35 usec
> 30: 195 bytes 49725 times --> 1128.97 Mbps in 1.32 usec
> 31: 253 bytes 26462 times --> 1257.97 Mbps in 1.53 usec
> 32: 256 bytes 32457 times --> 1304.17 Mbps in 1.50 usec
> 33: 259 bytes 33647 times --> 1354.14 Mbps in 1.46 usec
> 34: 381 bytes 34925 times --> 1616.43 Mbps in 1.80 usec
> 35: 384 bytes 37072 times --> 1676.92 Mbps in 1.75 usec
> 36: 387 bytes 38308 times --> 1724.50 Mbps in 1.71 usec
> 37: 509 bytes 19921 times --> 1908.30 Mbps in 2.03 usec
> 38: 512 bytes 24521 times --> 2013.16 Mbps in 1.94 usec
> 39: 515 bytes 25869 times --> 2038.18 Mbps in 1.93 usec
> 40: 765 bytes 26188 times --> 2474.81 Mbps in 2.36 usec
> 41: 768 bytes 28268 times --> 2513.00 Mbps in 2.33 usec
> 42: 771 bytes 28648 times --> 2531.45 Mbps in 2.32 usec
> 43: 1021 bytes 14512 times --> 2831.70 Mbps in 2.75 usec
> 44: 1024 bytes 18158 times --> 2853.94 Mbps in 2.74 usec
> 45: 1027 bytes 18300 times --> 2872.58 Mbps in 2.73 usec
> 46: 1533 bytes 18420 times --> 3298.65 Mbps in 3.55 usec
> 47: 1536 bytes 18802 times --> 3320.86 Mbps in 3.53 usec
> 48: 1539 bytes 18910 times --> 3351.99 Mbps in 3.50 usec
> 49: 2045 bytes 9571 times --> 3599.21 Mbps in 4.33 usec
> 50: 2048 bytes 11528 times --> 3640.91 Mbps in 4.29 usec
> 51: 2051 bytes 11662 times --> 3638.62 Mbps in 4.30 usec
> 52: 3069 bytes 11654 times --> 3905.17 Mbps in 6.00 usec
> 53: 3072 bytes 11118 times --> 3917.67 Mbps in 5.98 usec
> 54: 3075 bytes 11149 times --> 3973.53 Mbps in 5.90 usec
> 55: 4093 bytes 5662 times --> 4450.80 Mbps in 7.02 usec
> 56: 4096 bytes 7124 times --> 4445.17 Mbps in 7.03 usec
> 57: 4099 bytes 7115 times --> 4412.88 Mbps in 7.09 usec
> 58: 6141 bytes 7064 times --> 4962.74 Mbps in 9.44 usec
> 59: 6144 bytes 7061 times --> 4941.94 Mbps in 9.49 usec
> 60: 6147 bytes 7030 times --> 4938.46 Mbps in 9.50 usec
> 61: 8189 bytes 3515 times --> 5263.65 Mbps in 11.87 usec
> 62: 8192 bytes 4211 times --> 5249.31 Mbps in 11.91 usec
> 63: 8195 bytes 4200 times --> 5202.08 Mbps in 12.02 usec
> 64: 12285 bytes 4162 times --> 6380.89 Mbps in 14.69 usec
> 65: 12288 bytes 4538 times --> 6385.27 Mbps in 14.68 usec
> 66: 12291 bytes 4541 times --> 6335.05 Mbps in 14.80 usec
> 67: 16381 bytes 2253 times --> 6535.76 Mbps in 19.12 usec
> 68: 16384 bytes 2614 times --> 6537.24 Mbps in 19.12 usec
> 69: 16387 bytes 2615 times --> 6514.52 Mbps in 19.19 usec
> 70: 24573 bytes 2606 times --> 6870.51 Mbps in 27.29 usec
> 71: 24576 bytes 2443 times --> 6866.57 Mbps in 27.31 usec
> 72: 24579 bytes 2441 times --> 6864.32 Mbps in 27.32 usec
> 73: 32765 bytes 1220 times --> 7124.85 Mbps in 35.09 usec
> 74: 32768 bytes 1425 times --> 7120.30 Mbps in 35.11 usec
> 75: 32771 bytes 1424 times --> 7127.15 Mbps in 35.08 usec
> 76: 49149 bytes 1425 times --> 8313.31 Mbps in 45.11 usec
> 77: 49152 bytes 1478 times --> 8312.58 Mbps in 45.11 usec
> 78: 49155 bytes 1477 times --> 8309.34 Mbps in 45.13 usec
> 79: 65533 bytes 738 times --> 8219.82 Mbps in 60.83 usec
> 80: 65536 bytes 822 times --> 8209.24 Mbps in 60.91 usec
> 81: 65539 bytes 820 times --> 8216.00 Mbps in 60.86 usec
> 82: 98301 bytes 821 times --> 8698.24 Mbps in 86.22 usec
> 83: 98304 bytes 773 times --> 8695.03 Mbps in 86.26 usec
> 84: 98307 bytes 772 times --> 8696.95 Mbps in 86.24 usec
> 85: 131069 bytes 386 times --> 8916.50 Mbps in 112.15 usec
> 86: 131072 bytes 445 times --> 8917.29 Mbps in 112.14 usec
> 87: 131075 bytes 445 times --> 8916.62 Mbps in 112.15 usec
> 88: 196605 bytes 445 times --> 9205.17 Mbps in 162.95 usec
> 89: 196608 bytes 409 times --> 9195.75 Mbps in 163.12 usec
> 90: 196611 bytes 408 times --> 9203.02 Mbps in 162.99 usec
> 91: 262141 bytes 204 times --> 9338.32 Mbps in 214.17 usec
> 92: 262144 bytes 233 times --> 9350.57 Mbps in 213.89 usec
> 93: 262147 bytes 233 times --> 9336.72 Mbps in 214.21 usec
> 94: 393213 bytes 233 times --> 9480.21 Mbps in 316.45 usec
> 95: 393216 bytes 210 times --> 9476.10 Mbps in 316.59 usec
> 96: 393219 bytes 210 times --> 9471.25 Mbps in 316.75 usec
> 97: 524285 bytes 105 times --> 9523.20 Mbps in 420.02 usec
> 98: 524288 bytes 119 times --> 9519.53 Mbps in 420.19 usec
> 99: 524291 bytes 118 times --> 9523.09 Mbps in 420.03 usec
> 100: 786429 bytes 119 times --> 9555.83 Mbps in 627.89 usec
> 101: 786432 bytes 106 times --> 9542.67 Mbps in 628.75 usec
> 102: 786435 bytes 106 times --> 9554.47 Mbps in 627.98 usec
> 103: 1048573 bytes 53 times --> 9527.96 Mbps in 839.63 usec
> 104: 1048576 bytes 59 times --> 9530.63 Mbps in 839.40 usec
> 105: 1048579 bytes 59 times --> 9500.65 Mbps in 842.05 usec
> 106: 1572861 bytes 59 times --> 9389.53 Mbps in 1278.02 usec
> 107: 1572864 bytes 52 times --> 9396.87 Mbps in 1277.02 usec
> 108: 1572867 bytes 52 times --> 9375.01 Mbps in 1280.00 usec
> 109: 2097149 bytes 26 times --> 9271.33 Mbps in 1725.75 usec
> 110: 2097152 bytes 28 times --> 9273.64 Mbps in 1725.32 usec
> 111: 2097155 bytes 28 times --> 9281.42 Mbps in 1723.88 usec
> 112: 3145725 bytes 29 times --> 9109.93 Mbps in 2634.48 usec
> 113: 3145728 bytes 25 times --> 9128.80 Mbps in 2629.04 usec
> 114: 3145731 bytes 25 times --> 9099.66 Mbps in 2637.46 usec
> 115: 4194301 bytes 12 times --> 8840.19 Mbps in 3619.83 usec
> 116: 4194304 bytes 13 times --> 8847.10 Mbps in 3617.00 usec
> 117: 4194307 bytes 13 times --> 8827.22 Mbps in 3625.15 usec
> 118: 6291453 bytes 13 times --> 8351.40 Mbps in 5747.54 usec
> 119: 6291456 bytes 11 times --> 8345.46 Mbps in 5751.63 usec
> 120: 6291459 bytes 11 times --> 8343.42 Mbps in 5753.04 usec
> 121: 8388605 bytes 5 times --> 8166.28 Mbps in 7837.10 usec
> 122: 8388608 bytes 6 times --> 8166.91 Mbps in 7836.50 usec
> 123: 8388611 bytes 6 times --> 8162.67 Mbps in 7840.57 usec
> [6:29] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % cd ../osu/
> [6:29] svbu-mpi052:~/svn/ompi-tests/osu % mpirun --mca
> mpi_paffinity_alone 1 -np 2 --mca btl sm,self osu_latency
> # OSU MPI Latency Test (Version 2.1)
> # Size Latency (us)
> 0 0.85
> 1 0.91
> 2 0.91
> 4 0.99
> 8 0.99
> 16 0.99
> 32 1.08
> 64 1.08
> 128 1.25
> 256 1.49
> 512 1.92
> 1024 2.71
> 2048 4.40
> 4096 6.85
> 8192 11.48
> 16384 19.25
> 32768 35.25
> 65536 61.03
> 131072 113.15
> 262144 215.54
> 524288 428.19
> 1048576 880.72
> 2097152 1839.12
> 4194304 3934.90
> [6:29] svbu-mpi052:~/svn/ompi-tests/osu %
>
> ================================================================
> r18973
>
> [6:36] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % mpirun --mca
> mpi_paffinity_alone 1 -np 2 --mca btl sm,self NPmpi
> 1: svbu-mpi052
> 0: svbu-mpi052
> Now starting the main loop
> 0: 1 bytes 84392 times --> 8.29 Mbps in 0.92 usec
> 1: 2 bytes 108626 times --> 16.58 Mbps in 0.92 usec
> 2: 3 bytes 108657 times --> 24.91 Mbps in 0.92 usec
> 3: 4 bytes 72561 times --> 30.33 Mbps in 1.01 usec
> 4: 6 bytes 74529 times --> 45.51 Mbps in 1.01 usec
> 5: 8 bytes 49709 times --> 60.76 Mbps in 1.00 usec
> 6: 12 bytes 62222 times --> 90.84 Mbps in 1.01 usec
> 7: 13 bytes 41344 times --> 98.58 Mbps in 1.01 usec
> 8: 16 bytes 45875 times --> 121.19 Mbps in 1.01 usec
> 9: 19 bytes 55845 times --> 143.43 Mbps in 1.01 usec
> 10: 21 bytes 62491 times --> 156.66 Mbps in 1.02 usec
> 11: 24 bytes 65185 times --> 177.87 Mbps in 1.03 usec
> 12: 27 bytes 68806 times --> 187.63 Mbps in 1.10 usec
> 13: 29 bytes 40482 times --> 202.10 Mbps in 1.09 usec
> 14: 32 bytes 44096 times --> 222.11 Mbps in 1.10 usec
> 15: 35 bytes 48331 times --> 255.12 Mbps in 1.05 usec
> 16: 45 bytes 54593 times --> 308.42 Mbps in 1.11 usec
> 17: 48 bytes 59888 times --> 330.10 Mbps in 1.11 usec
> 18: 51 bytes 61970 times --> 348.31 Mbps in 1.12 usec
> 19: 61 bytes 35104 times --> 409.39 Mbps in 1.14 usec
> 20: 64 bytes 43261 times --> 451.69 Mbps in 1.08 usec
> 21: 67 bytes 47698 times --> 489.98 Mbps in 1.04 usec
> 22: 93 bytes 51504 times --> 565.69 Mbps in 1.25 usec
> 23: 96 bytes 53150 times --> 598.55 Mbps in 1.22 usec
> 24: 99 bytes 55333 times --> 623.24 Mbps in 1.21 usec
> 25: 125 bytes 30005 times --> 735.91 Mbps in 1.30 usec
> 26: 128 bytes 38274 times --> 781.32 Mbps in 1.25 usec
> 27: 131 bytes 40628 times --> 828.90 Mbps in 1.21 usec
> 28: 189 bytes 43050 times --> 1018.02 Mbps in 1.42 usec
> 29: 192 bytes 47066 times --> 1069.01 Mbps in 1.37 usec
> 30: 195 bytes 49032 times --> 1122.18 Mbps in 1.33 usec
> 31: 253 bytes 26303 times --> 1259.95 Mbps in 1.53 usec
> 32: 256 bytes 32508 times --> 1307.53 Mbps in 1.49 usec
> 33: 259 bytes 33734 times --> 1357.47 Mbps in 1.46 usec
> 34: 381 bytes 35011 times --> 1617.08 Mbps in 1.80 usec
> 35: 384 bytes 37087 times --> 1675.72 Mbps in 1.75 usec
> 36: 387 bytes 38280 times --> 1722.27 Mbps in 1.71 usec
> 37: 509 bytes 19895 times --> 1913.58 Mbps in 2.03 usec
> 38: 512 bytes 24589 times --> 1967.08 Mbps in 1.99 usec
> 39: 515 bytes 25276 times --> 2041.10 Mbps in 1.93 usec
> 40: 765 bytes 26226 times --> 2448.96 Mbps in 2.38 usec
> 41: 768 bytes 27973 times --> 2503.60 Mbps in 2.34 usec
> 42: 771 bytes 28541 times --> 2541.12 Mbps in 2.31 usec
> 43: 1021 bytes 14567 times --> 2845.46 Mbps in 2.74 usec
> 44: 1024 bytes 18246 times --> 2854.45 Mbps in 2.74 usec
> 45: 1027 bytes 18304 times --> 2939.64 Mbps in 2.67 usec
> 46: 1533 bytes 18850 times --> 3291.70 Mbps in 3.55 usec
> 47: 1536 bytes 18762 times --> 3310.45 Mbps in 3.54 usec
> 48: 1539 bytes 18851 times --> 3386.68 Mbps in 3.47 usec
> 49: 2045 bytes 9670 times --> 3635.22 Mbps in 4.29 usec
> 50: 2048 bytes 11644 times --> 3646.70 Mbps in 4.28 usec
> 51: 2051 bytes 11680 times --> 3640.09 Mbps in 4.30 usec
> 52: 3069 bytes 11659 times --> 3926.68 Mbps in 5.96 usec
> 53: 3072 bytes 11180 times --> 3962.33 Mbps in 5.92 usec
> 54: 3075 bytes 11276 times --> 3978.54 Mbps in 5.90 usec
> 55: 4093 bytes 5669 times --> 4398.66 Mbps in 7.10 usec
> 56: 4096 bytes 7041 times --> 4429.95 Mbps in 7.05 usec
> 57: 4099 bytes 7091 times --> 4378.99 Mbps in 7.14 usec
> 58: 6141 bytes 7009 times --> 5001.17 Mbps in 9.37 usec
> 59: 6144 bytes 7116 times --> 4984.01 Mbps in 9.41 usec
> 60: 6147 bytes 7090 times --> 5015.48 Mbps in 9.35 usec
> 61: 8189 bytes 3570 times --> 5286.90 Mbps in 11.82 usec
> 62: 8192 bytes 4230 times --> 5222.58 Mbps in 11.97 usec
> 63: 8195 bytes 4179 times --> 5261.91 Mbps in 11.88 usec
> 64: 12285 bytes 4210 times --> 6370.90 Mbps in 14.71 usec
> 65: 12288 bytes 4531 times --> 6376.57 Mbps in 14.70 usec
> 66: 12291 bytes 4535 times --> 6349.10 Mbps in 14.77 usec
> 67: 16381 bytes 2258 times --> 6521.57 Mbps in 19.16 usec
> 68: 16384 bytes 2608 times --> 6520.25 Mbps in 19.17 usec
> 69: 16387 bytes 2608 times --> 6504.81 Mbps in 19.22 usec
> 70: 24573 bytes 2602 times --> 6867.93 Mbps in 27.30 usec
> 71: 24576 bytes 2442 times --> 6869.27 Mbps in 27.30 usec
> 72: 24579 bytes 2442 times --> 6864.04 Mbps in 27.32 usec
> 73: 32765 bytes 1220 times --> 7118.03 Mbps in 35.12 usec
> 74: 32768 bytes 1423 times --> 7117.77 Mbps in 35.12 usec
> 75: 32771 bytes 1423 times --> 7120.85 Mbps in 35.11 usec
> 76: 49149 bytes 1424 times --> 8324.26 Mbps in 45.05 usec
> 77: 49152 bytes 1479 times --> 8328.77 Mbps in 45.02 usec
> 78: 49155 bytes 1480 times --> 8320.47 Mbps in 45.07 usec
> 79: 65533 bytes 739 times --> 8214.38 Mbps in 60.87 usec
> 80: 65536 bytes 821 times --> 8219.87 Mbps in 60.83 usec
> 81: 65539 bytes 822 times --> 8232.40 Mbps in 60.74 usec
> 82: 98301 bytes 823 times --> 8717.21 Mbps in 86.03 usec
> 83: 98304 bytes 774 times --> 8716.08 Mbps in 86.05 usec
> 84: 98307 bytes 774 times --> 8714.26 Mbps in 86.07 usec
> 85: 131069 bytes 387 times --> 8921.59 Mbps in 112.09 usec
> 86: 131072 bytes 446 times --> 8935.37 Mbps in 111.91 usec
> 87: 131075 bytes 446 times --> 8925.47 Mbps in 112.04 usec
> 88: 196605 bytes 446 times --> 9195.80 Mbps in 163.12 usec
> 89: 196608 bytes 408 times --> 9197.41 Mbps in 163.09 usec
> 90: 196611 bytes 408 times --> 9204.33 Mbps in 162.97 usec
> 91: 262141 bytes 204 times --> 9344.95 Mbps in 214.02 usec
> 92: 262144 bytes 233 times --> 9347.58 Mbps in 213.96 usec
> 93: 262147 bytes 233 times --> 9340.56 Mbps in 214.12 usec
> 94: 393213 bytes 233 times --> 9473.27 Mbps in 316.68 usec
> 95: 393216 bytes 210 times --> 9486.24 Mbps in 316.25 usec
> 96: 393219 bytes 210 times --> 9500.26 Mbps in 315.78 usec
> 97: 524285 bytes 105 times --> 9538.88 Mbps in 419.33 usec
> 98: 524288 bytes 119 times --> 9543.40 Mbps in 419.14 usec
> 99: 524291 bytes 119 times --> 9534.73 Mbps in 419.52 usec
> 100: 786429 bytes 119 times --> 9574.15 Mbps in 626.69 usec
> 101: 786432 bytes 106 times --> 9565.70 Mbps in 627.24 usec
> 102: 786435 bytes 106 times --> 9544.50 Mbps in 628.64 usec
> 103: 1048573 bytes 53 times --> 9530.85 Mbps in 839.38 usec
> 104: 1048576 bytes 59 times --> 9525.24 Mbps in 839.87 usec
> 105: 1048579 bytes 59 times --> 9511.86 Mbps in 841.06 usec
> 106: 1572861 bytes 59 times --> 9391.40 Mbps in 1277.76 usec
> 107: 1572864 bytes 52 times --> 9395.54 Mbps in 1277.20 usec
> 108: 1572867 bytes 52 times --> 9386.02 Mbps in 1278.50 usec
> 109: 2097149 bytes 26 times --> 9298.48 Mbps in 1720.71 usec
> 110: 2097152 bytes 29 times --> 9313.43 Mbps in 1717.95 usec
> 111: 2097155 bytes 29 times --> 9293.49 Mbps in 1721.64 usec
> 112: 3145725 bytes 29 times --> 9126.67 Mbps in 2629.65 usec
> 113: 3145728 bytes 25 times --> 9113.76 Mbps in 2633.38 usec
> 114: 3145731 bytes 25 times --> 9079.90 Mbps in 2643.20 usec
> 115: 4194301 bytes 12 times --> 8810.57 Mbps in 3632.00 usec
> 116: 4194304 bytes 13 times --> 8821.99 Mbps in 3627.30 usec
> 117: 4194307 bytes 13 times --> 8801.17 Mbps in 3635.88 usec
> 118: 6291453 bytes 13 times --> 8337.50 Mbps in 5757.12 usec
> 119: 6291456 bytes 11 times --> 8332.94 Mbps in 5760.27 usec
> 120: 6291459 bytes 11 times --> 8346.25 Mbps in 5751.09 usec
> 121: 8388605 bytes 5 times --> 8159.20 Mbps in 7843.90 usec
> 122: 8388608 bytes 6 times --> 8166.83 Mbps in 7836.58 usec
> 123: 8388611 bytes 6 times --> 8161.26 Mbps in 7841.92 usec
> [6:37] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % cd ../osu/
> [6:37] svbu-mpi052:~/svn/ompi-tests/osu % mpirun --mca
> mpi_paffinity_alone 1 -np 2 --mca btl sm,self osu_latency
> # OSU MPI Latency Test (Version 2.1)
> # Size Latency (us)
> 0 0.85
> 1 0.91
> 2 0.91
> 4 0.99
> 8 0.99
> 16 0.99
> 32 1.09
> 64 1.07
> 128 1.25
> 256 1.49
> 512 1.97
> 1024 2.69
> 2048 4.29
> 4096 6.83
> 8192 11.41
> 16384 19.69
> 32768 35.27
> 65536 61.06
> 131072 112.51
> 262144 215.47
> 524288 429.60
> 1048576 882.89
> 2097152 1836.45
> 4194304 3943.47
> [6:37] svbu-mpi052:~/svn/ompi-tests/osu %
>
> ================================================================
> r18850
> [6:31] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % mpirun --mca
> mpi_paffinity_alone 1 -np 2 --mca btl sm,self NPmpi
> 0: svbu-mpi052
> 1: svbu-mpi052
> Now starting the main loop
> 0: 1 bytes 116185 times --> 11.32 Mbps in 0.67 usec
> 1: 2 bytes 148348 times --> 22.58 Mbps in 0.68 usec
> 2: 3 bytes 147969 times --> 33.88 Mbps in 0.68 usec
> 3: 4 bytes 98695 times --> 40.58 Mbps in 0.75 usec
> 4: 6 bytes 99737 times --> 60.85 Mbps in 0.75 usec
> 5: 8 bytes 66464 times --> 81.13 Mbps in 0.75 usec
> 6: 12 bytes 83076 times --> 121.58 Mbps in 0.75 usec
> 7: 13 bytes 55334 times --> 131.83 Mbps in 0.75 usec
> 8: 16 bytes 61344 times --> 161.81 Mbps in 0.75 usec
> 9: 19 bytes 74561 times --> 190.93 Mbps in 0.76 usec
> 10: 21 bytes 83186 times --> 207.97 Mbps in 0.77 usec
> 11: 24 bytes 86535 times --> 235.30 Mbps in 0.78 usec
> 12: 27 bytes 91024 times --> 241.36 Mbps in 0.85 usec
> 13: 29 bytes 52074 times --> 260.24 Mbps in 0.85 usec
> 14: 32 bytes 56782 times --> 286.57 Mbps in 0.85 usec
> 15: 35 bytes 62357 times --> 341.55 Mbps in 0.78 usec
> 16: 45 bytes 73090 times --> 400.53 Mbps in 0.86 usec
> 17: 48 bytes 77776 times --> 425.94 Mbps in 0.86 usec
> 18: 51 bytes 79963 times --> 449.27 Mbps in 0.87 usec
> 19: 61 bytes 45280 times --> 520.58 Mbps in 0.89 usec
> 20: 64 bytes 55011 times --> 589.77 Mbps in 0.83 usec
> 21: 67 bytes 62279 times --> 651.96 Mbps in 0.78 usec
> 22: 93 bytes 68530 times --> 706.75 Mbps in 1.00 usec
> 23: 96 bytes 66405 times --> 756.56 Mbps in 0.97 usec
> 24: 99 bytes 69940 times --> 786.11 Mbps in 0.96 usec
> 25: 125 bytes 37846 times --> 917.31 Mbps in 1.04 usec
> 26: 128 bytes 47708 times --> 991.21 Mbps in 0.99 usec
> 27: 131 bytes 51542 times --> 1030.40 Mbps in 0.97 usec
> 28: 189 bytes 53515 times --> 1228.14 Mbps in 1.17 usec
> 29: 192 bytes 56781 times --> 1317.94 Mbps in 1.11 usec
> 30: 195 bytes 60449 times --> 1372.28 Mbps in 1.08 usec
> 31: 253 bytes 32165 times --> 1506.60 Mbps in 1.28 usec
> 32: 256 bytes 38871 times --> 1590.08 Mbps in 1.23 usec
> 33: 259 bytes 41024 times --> 1657.90 Mbps in 1.19 usec
> 34: 381 bytes 42760 times --> 1894.98 Mbps in 1.53 usec
> 35: 384 bytes 43460 times --> 1958.92 Mbps in 1.50 usec
> 36: 387 bytes 44750 times --> 2029.44 Mbps in 1.45 usec
> 37: 509 bytes 23444 times --> 2176.96 Mbps in 1.78 usec
> 38: 512 bytes 27974 times --> 2268.97 Mbps in 1.72 usec
> 39: 515 bytes 29156 times --> 2340.62 Mbps in 1.68 usec
> 40: 765 bytes 30074 times --> 2698.17 Mbps in 2.16 usec
> 41: 768 bytes 30819 times --> 2778.48 Mbps in 2.11 usec
> 42: 771 bytes 31674 times --> 2847.11 Mbps in 2.07 usec
> 43: 1021 bytes 16322 times --> 3039.90 Mbps in 2.56 usec
> 44: 1024 bytes 19493 times --> 3161.06 Mbps in 2.47 usec
> 45: 1027 bytes 20270 times --> 3221.90 Mbps in 2.43 usec
> 46: 1533 bytes 20660 times --> 3455.95 Mbps in 3.38 usec
> 47: 1536 bytes 19698 times --> 3580.63 Mbps in 3.27 usec
> 48: 1539 bytes 20389 times --> 3623.40 Mbps in 3.24 usec
> 49: 2045 bytes 10346 times --> 3751.80 Mbps in 4.16 usec
> 50: 2048 bytes 12017 times --> 3833.40 Mbps in 4.08 usec
> 51: 2051 bytes 12278 times --> 3813.67 Mbps in 4.10 usec
> 52: 3069 bytes 12215 times --> 3997.25 Mbps in 5.86 usec
> 53: 3072 bytes 11381 times --> 4058.18 Mbps in 5.78 usec
> 54: 3075 bytes 11548 times --> 4102.09 Mbps in 5.72 usec
> 55: 4093 bytes 5845 times --> 4726.24 Mbps in 6.61 usec
> 56: 4096 bytes 7565 times --> 4679.74 Mbps in 6.68 usec
> 57: 4099 bytes 7491 times --> 4649.50 Mbps in 6.73 usec
> 58: 6141 bytes 7442 times --> 5072.39 Mbps in 9.24 usec
> 59: 6144 bytes 7217 times --> 5064.70 Mbps in 9.26 usec
> 60: 6147 bytes 7204 times --> 5067.07 Mbps in 9.26 usec
> 61: 8189 bytes 3606 times --> 5387.85 Mbps in 11.60 usec
> 62: 8192 bytes 4311 times --> 5393.87 Mbps in 11.59 usec
> 63: 8195 bytes 4316 times --> 5301.81 Mbps in 11.79 usec
> 64: 12285 bytes 4242 times --> 6568.81 Mbps in 14.27 usec
> 65: 12288 bytes 4672 times --> 6561.90 Mbps in 14.29 usec
> 66: 12291 bytes 4666 times --> 6548.01 Mbps in 14.32 usec
> 67: 16381 bytes 2329 times --> 6662.43 Mbps in 18.76 usec
> 68: 16384 bytes 2665 times --> 6655.18 Mbps in 18.78 usec
> 69: 16387 bytes 2662 times --> 6634.79 Mbps in 18.84 usec
> 70: 24573 bytes 2654 times --> 6937.26 Mbps in 27.02 usec
> 71: 24576 bytes 2466 times --> 6937.41 Mbps in 27.03 usec
> 72: 24579 bytes 2466 times --> 6931.40 Mbps in 27.05 usec
> 73: 32765 bytes 1232 times --> 7218.55 Mbps in 34.63 usec
> 74: 32768 bytes 1443 times --> 7213.85 Mbps in 34.66 usec
> 75: 32771 bytes 1442 times --> 7218.89 Mbps in 34.63 usec
> 76: 49149 bytes 1443 times --> 8387.79 Mbps in 44.71 usec
> 77: 49152 bytes 1491 times --> 8385.50 Mbps in 44.72 usec
> 78: 49155 bytes 1490 times --> 8390.79 Mbps in 44.69 usec
> 79: 65533 bytes 745 times --> 8261.32 Mbps in 60.52 usec
> 80: 65536 bytes 826 times --> 8260.34 Mbps in 60.53 usec
> 81: 65539 bytes 826 times --> 8265.33 Mbps in 60.50 usec
> 82: 98301 bytes 826 times --> 8747.13 Mbps in 85.74 usec
> 83: 98304 bytes 777 times --> 8746.72 Mbps in 85.75 usec
> 84: 98307 bytes 777 times --> 8733.81 Mbps in 85.88 usec
> 85: 131069 bytes 388 times --> 8956.71 Mbps in 111.65 usec
> 86: 131072 bytes 447 times --> 8967.16 Mbps in 111.52 usec
> 87: 131075 bytes 448 times --> 8960.56 Mbps in 111.60 usec
> 88: 196605 bytes 448 times --> 9247.58 Mbps in 162.20 usec
> 89: 196608 bytes 411 times --> 9234.30 Mbps in 162.44 usec
> 90: 196611 bytes 410 times --> 9231.32 Mbps in 162.49 usec
> 91: 262141 bytes 205 times --> 9365.98 Mbps in 213.54 usec
> 92: 262144 bytes 234 times --> 9368.25 Mbps in 213.49 usec
> 93: 262147 bytes 234 times --> 9363.09 Mbps in 213.61 usec
> 94: 393213 bytes 234 times --> 9512.63 Mbps in 315.37 usec
> 95: 393216 bytes 211 times --> 9497.01 Mbps in 315.89 usec
> 96: 393219 bytes 211 times --> 9510.80 Mbps in 315.43 usec
> 97: 524285 bytes 105 times --> 9553.55 Mbps in 418.69 usec
> 98: 524288 bytes 119 times --> 9561.59 Mbps in 418.34 usec
> 99: 524291 bytes 119 times --> 9551.86 Mbps in 418.77 usec
> 100: 786429 bytes 119 times --> 9582.63 Mbps in 626.13 usec
> 101: 786432 bytes 106 times --> 9576.72 Mbps in 626.52 usec
> 102: 786435 bytes 106 times --> 9584.78 Mbps in 625.99 usec
> 103: 1048573 bytes 53 times --> 9545.32 Mbps in 838.10 usec
> 104: 1048576 bytes 59 times --> 9532.37 Mbps in 839.25 usec
> 105: 1048579 bytes 59 times --> 9542.90 Mbps in 838.32 usec
> 106: 1572861 bytes 59 times --> 9434.44 Mbps in 1271.93 usec
> 107: 1572864 bytes 52 times --> 9400.64 Mbps in 1276.51 usec
> 108: 1572867 bytes 52 times --> 9409.24 Mbps in 1275.34 usec
> 109: 2097149 bytes 26 times --> 9305.75 Mbps in 1719.36 usec
> 110: 2097152 bytes 29 times --> 9314.56 Mbps in 1717.74 usec
> 111: 2097155 bytes 29 times --> 9278.43 Mbps in 1724.43 usec
> 112: 3145725 bytes 28 times --> 9065.15 Mbps in 2647.50 usec
> 113: 3145728 bytes 25 times --> 9095.10 Mbps in 2638.78 usec
> 114: 3145731 bytes 25 times --> 9073.88 Mbps in 2644.96 usec
> 115: 4194301 bytes 12 times --> 8772.63 Mbps in 3647.70 usec
> 116: 4194304 bytes 13 times --> 8768.32 Mbps in 3649.50 usec
> 117: 4194307 bytes 13 times --> 8771.37 Mbps in 3648.24 usec
> 118: 6291453 bytes 13 times --> 8321.22 Mbps in 5768.38 usec
> 119: 6291456 bytes 11 times --> 8320.00 Mbps in 5769.23 usec
> 120: 6291459 bytes 11 times --> 8335.25 Mbps in 5758.68 usec
> 121: 8388605 bytes 5 times --> 8167.02 Mbps in 7836.39 usec
> 122: 8388608 bytes 6 times --> 8165.44 Mbps in 7837.91 usec
> 123: 8388611 bytes 6 times --> 8162.24 Mbps in 7840.99 usec
> [6:32] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % cd ../osu/
> [6:32] svbu-mpi052:~/svn/ompi-tests/osu % mpirun --mca
> mpi_paffinity_alone 1 -np 2 --mca btl sm,self osu_latency
> # OSU MPI Latency Test (Version 2.1)
> # Size Latency (us)
> 0 0.65
> 1 0.69
> 2 0.69
> 4 0.76
> 8 0.76
> 16 0.76
> 32 0.85
> 64 0.83
> 128 1.03
> 256 1.25
> 512 1.73
> 1024 2.47
> 2048 4.18
> 4096 6.53
> 8192 11.23
> 16384 18.91
> 32768 34.97
> 65536 60.80
> 131072 112.09
> 262144 215.15
> 524288 427.97
> 1048576 880.90
> 2097152 1840.40
> 4194304 3945.23
> [6:33] svbu-mpi052:~/svn/ompi-tests/osu %
>
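
To put rough numbers on the short-message gap, here is a minimal sketch that uses only the osu_latency figures quoted above (r18850 vs. r18997, sizes up to 1024 bytes). The function name `slowdown` and the dict layout are illustrative choices, not part of either test suite.

```python
# osu_latency results (message size in bytes -> latency in usec), copied
# from the runs quoted above for sizes up to 1024 bytes.
r18850 = {0: 0.65, 1: 0.69, 2: 0.69, 4: 0.76, 8: 0.76, 16: 0.76,
          32: 0.85, 64: 0.83, 128: 1.03, 256: 1.25, 512: 1.73, 1024: 2.47}
r18997 = {0: 0.85, 1: 0.91, 2: 0.91, 4: 0.99, 8: 0.99, 16: 0.99,
          32: 1.08, 64: 1.08, 128: 1.25, 256: 1.49, 512: 1.92, 1024: 2.71}


def slowdown(old, new):
    """Return {size: (absolute delta in usec, relative delta in percent)}."""
    return {size: (new[size] - old[size],
                   100.0 * (new[size] - old[size]) / old[size])
            for size in sorted(old) if size in new}


deltas = slowdown(r18850, r18997)
for size, (abs_us, rel_pct) in deltas.items():
    print(f"{size:>5} bytes: +{abs_us:.2f} usec ({rel_pct:+.1f}%)")
```

For these sizes the difference works out to roughly 0.2 to 0.25 usec per message regardless of size, which looks like a fixed per-message overhead rather than a bandwidth effect. That would explain why it dominates the short sends and disappears into the noise for the multi-megabyte messages.
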
>
>
> On Jul 23, 2008, at 7:24 AM, Lenny Verkhovsky wrote:
>
>> Sorry Terry, :).
>>
>> ---------- Forwarded message ----------
>> From: Lenny Verkhovsky <lenny.verkhovsky_at_[hidden]>
>> Date: Jul 23, 2008 2:22 PM
>> Subject: Re: [OMPI devel] [OMPI bugs] [Open MPI] #1250: Performance
>> problem on SM
>> To: Lenny Verkhovsky <lenny.verkhovsky_at_[hidden]>
>>
>>
>>
>> On 7/23/08, Terry Dontje <Terry.Dontje_at_[hidden]> wrote:
>>
>> I didn't see any attached results on the email.
>>
>> --td
>> Lenny Verkhovsky wrote:
>>
>> I rechecked it on the same node, still no degradation,
>>
>> see results attached.
>>
>>
>> On 7/22/08, *Open MPI* <bugs_at_[hidden] <mailto:bugs_at_open-mpi.org>> wrote:
>>
>> #1250: Performance problem on SM
>> --------------------+-------------------------------------------------------
>> Reporter: bosilca   | Owner: bosilca
>> Type: defect        | Status: assigned
>> Priority: blocker   | Milestone: Open MPI 1.3
>> Version:            | Resolution:
>> Keywords:           |
>> --------------------+-------------------------------------------------------
>>
>>
>> Comment(by tdd):
>>
>> Hmmm, Lennyve, isn't your mpirun above going across nodes rather than
>> running on the same node? I am running NetPIPE on a single node.
>>
>>
>> --
>> Ticket URL:
>> <https://svn.open-mpi.org/trac/ompi/ticket/1250#comment:20>
>>
>> Open MPI <http://www.open-mpi.org/>
>>
>>
>> _______________________________________________
>> bugs mailing list
>> bugs_at_[hidden] <mailto:bugs_at_[hidden]>
>> http://www.open-mpi.org/mailman/listinfo.cgi/bugs
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>>
>>
>> <NPmpi.log>
>
>
> --
> Jeff Squyres
> Cisco Systems
>


