Hi

 

Do you have any details about the performance of mxm, e.g. for real applications?

 

Thanks

 

Henk

 

From: users-bounces@open-mpi.org [mailto:users-bounces@open-mpi.org] On Behalf Of Mike Dubman
Sent: 11 May 2012 19:23
To: Open MPI Users
Subject: Re: [OMPI users] ompi mca mxm version

 

ob1/openib is RC based which have scalability issues, mxm 1.1 is ud based and kicks in at scale.

We observe mxm outperforms ob1 on 8+ nodes.

 

We will update docs as you mentioned, thanks

 

Regards

 



 

On Thu, May 10, 2012 at 4:30 PM, Derek Gerstmann <derek.gerstmann@uwa.edu.au> wrote:

On May 9, 2012, at 7:41 PM, Mike Dubman wrote:

> you need latest OMPI 1.6.x and latest MXM (ftp://bgate.mellanox.com/hpc/mxm/v1.1/mxm_1.1.1067.tar)

Excellent!  Thanks for the quick response!  Using the MXM v1.1.1067 against OMPI v1.6.x did the trick.  Please (!!!) add a note to the docs for OMPI 1.6.x to help out other users -- there's zero mention of this anywhere that I could find from scouring the archives and source code.

Sadly, performance isn't what we'd expect.  OB1 is outperforming CM MXM (consistently).

Are there any suggested configuration settings?  We tried all the obvious ones listed in the OMPI Wiki and mailing list archives, but few have had much of an effect.

We seem to do better with the OB1 openib btl, than the lower level CM MXM.  Any suggestions?

Here's numbers from the OSU MicroBenchmarks (for the MBW_MR test) running on 2x pairs, aka 4 separate hosts, each using Mellanox ConnectX, one card per host, single port, single switch):

-- OB1
> /opt/openmpi/1.6.0/bin/mpiexec -np 4 --mca pml ob1 --mca btl ^tcp --mca mpi_use_pinned 1 -hostfile all_hosts ./osu-micro-benchmarks/osu_mbw_mr
# OSU MPI Multiple Bandwidth / Message Rate Test v3.6
# [ pairs: 2 ] [ window size: 64 ]
# Size                  MB/s        Messages/s
1                       2.91        2909711.73
2                       5.97        2984274.11
4                      11.70        2924292.78
8                      23.00        2874502.93
16                     44.75        2796639.64
32                     89.49        2796639.64
64                    175.98        2749658.96
128                   292.41        2284459.86
256                   527.84        2061874.61
512                   961.65        1878221.77
1024                 1669.06        1629943.87
2048                 2220.43        1084193.45
4096                 2906.57         709611.68
8192                 3017.65         368365.70
16384                5225.97         318967.95
32768                5418.98         165374.23
65536                5998.07          91523.27
131072               6031.69          46018.16
262144               6063.38          23129.97
524288               5971.77          11390.24
1048576              5788.75           5520.59
2097152              5791.39           2761.55
4194304              5820.60           1387.74

-- MXM
> /opt/openmpi/1.6.0/bin/mpiexec -np 4 --mca pml cm --mca mtl mxm --mca btl ^tcp --mca mpi_use_pinned 1 -hostfile all_hosts ./osu-micro-benchmarks/osu_mbw_mr
# OSU MPI Multiple Bandwidth / Message Rate Test v3.6
# [ pairs: 2 ] [ window size: 64 ]
# Size                  MB/s        Messages/s
1                       2.07        2074863.43
2                       4.14        2067830.81
4                      10.57        2642471.39
8                      23.16        2895275.37
16                     38.73        2420627.22
32                     66.77        2086718.41
64                    147.87        2310414.05
128                   284.94        2226109.85
256                   537.27        2098709.64
512                  1041.91        2034989.43
1024                 1930.93        1885676.34
2048                 1998.68         975916.00
4096                 2880.72         703299.77
8192                 3608.45         440484.17
16384                4027.15         245797.51
32768                4464.85         136256.47
65536                4594.22          70102.23
131072               4655.62          35519.55
262144               4671.56          17820.58
524288               4604.16           8781.74
1048576              4635.51           4420.77
2097152              3575.17           1704.78
4194304              2828.19            674.29


Thanks!

-[dg]

Derek Gerstmann, PhD Student
The University of Western Australia (UWA)

w: http://local.ivec.uwa.edu.au/~derek
e: derek.gerstmann [at] icrar.org

On May 9, 2012, at 7:41 PM, Mike Dubman wrote:

> you need latest OMPI 1.6.x and latest MXM (ftp://bgate.mellanox.com/hpc/mxm/v1.1/mxm_1.1.1067.tar)
>
>
>
> On Wed, May 9, 2012 at 6:02 AM, Derek Gerstmann <derek.gerstmann@uwa.edu.au> wrote:
> What versions of OpenMPI and the Mellanox MXM libraries have been tested and verified to work?
>
> We are currently trying to build OpenMPI v1.5.5 against the MXM 1.0.601 (included in the MLNX_OFED_LINUX-1.5.3-3.0.0 distribution) and are getting build errors.
>
> Specifically, there's a single undefined type (mxm_wait_t) being used in the OpenMPI tree:
>
>       openmpi-1.5.5/ompi/mca/mtl/mxm/mtl_mxm_send.c:44        mxm_wait_t wait;
>
> There is no mxm_wait_t defined anywhere in the current MXM API (/opt/mellanox/mxm/include/mxm/api), which suggests a version mismatch.
>
> The OpenMPI v1.6 branch has a note in the readme saying "Minor Fixes for Mellanox MXM" were added, but the same undefined mxm_wait_t is still being used.
>
> What versions of OpenMPI and MXM are verified to work?
>
> Thanks!
>
> -[dg]
>
> Derek Gerstmann, PhD Student
> The University of Western Australia (UWA)
>
> w: http://local.ivec.uwa.edu.au/~derek
> e: derek.gerstmann [at] icrar.org
>
>
>
>
> _______________________________________________
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> _______________________________________________
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



-[dg]

Derek Gerstmann, PhD Student
The University of Western Australia (UWA)

w: http://local.ivec.uwa.edu.au/~derek
e: derek.gerstmann [at] icrar.org




_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users