Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] tcp of openmpi-1.7.3 under our environment is very slow
From: tmishima_at_[hidden]
Date: 2013-12-18 18:47:48


Hi Jeff,

I re-ran the tests with processor binding enabled, using both openmpi-1.7.3
and 1.7.4rc1, but I got the same results as without binding.

In addition, the core mapping of 1.7.4rc1 looks strange, although this is
unrelated to the TCP slowdown.

Regards,
Tetsuya Mishima
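
To make the mapping difference easier to see, here is a minimal Python sketch that reads the binding maps printed by -report-bindings in the logs below. The helper name bound_cores is hypothetical (it is not part of Open MPI), and it assumes the map format seen in this message: one bracketed group per socket, cores separated by "/", and "B" marking a bound core.

```python
def bound_cores(binding_map):
    """Return (socket, core) index pairs marked 'B' in a map like '[B/././.][./././.]'."""
    pairs = []
    # Drop the outer brackets, then split into one string per socket.
    sockets = binding_map.strip().strip("[]").split("][")
    for socket_idx, cores in enumerate(sockets):
        for core_idx, mark in enumerate(cores.split("/")):
            if "B" in mark:
                pairs.append((socket_idx, core_idx))
    return pairs


# Rank 1 maps copied from the osu_bw runs in this message.
print("1.7.3    rank 1:", bound_cores("[B/././.][./././.]"))
print("1.7.4rc1 rank 1:", bound_cores("[./B/./.][./././.]"))
```

With these two inputs, rank 1 comes out on socket 0, core 0 under 1.7.3 and on socket 0, core 1 under 1.7.4rc1, even though it is the only rank on node09.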

[mishima_at_node08 OMB-3.1.1]$ mpirun -V
mpirun (Open MPI) 1.7.3

Report bugs to http://www.open-mpi.org/community/help/
[mishima_at_node08 OMB-3.1.1]$ mpirun -np 2 -host node08,node09 -mca btl ^openib -bind-to core -report-bindings osu_bw
[node08.cluster:23950] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././.][./././.]
[node09.cluster:23477] MCW rank 1 bound to socket 0[core 0[hwt 0]]: [B/././.][./././.]
# OSU MPI Bandwidth Test v3.1.1
# Size Bandwidth (MB/s)
1 0.00
2 0.01
4 0.01
8 0.02
16 0.05
32 0.09
64 6.49
128 0.39
256 1.74
512 9.51
1024 26.59
2048 182.55
4096 202.52
8192 217.44
16384 227.91
32768 231.11
65536 112.57
131072 217.01
262144 215.49
524288 233.97
1048576 231.33
2097152 235.04
4194304 234.77
[mishima_at_node08 OMB-3.1.1]$ mpirun -np 2 -host node08,node09 -mca btl ^openib -bind-to core -report-bindings osu_latency
[node08.cluster:23968] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././.][./././.]
[node09.cluster:23522] MCW rank 1 bound to socket 0[core 0[hwt 0]]: [B/././.][./././.]
# OSU MPI Latency Test v3.1.1
# Size Latency (us)
0 18.08
1 18.46
2 18.37
4 18.45
8 18.96
16 18.98
32 19.31
64 19.83
128 20.24
256 21.86
512 24.74
1024 30.02
2048 71.07
4096 73.64
8192 106.67
16384 176.36
32768 250.88
65536 20188.73
131072 21141.11
262144 18462.47
524288 24940.10
1048576 26160.76
2097152 29538.91
4194304 42420.03

[mishima_at_node08 OMB-3.1.1]$ mpirun -V
mpirun (Open MPI) 1.7.4rc1

Report bugs to http://www.open-mpi.org/community/help/
[mishima_at_node08 OMB-3.1.1]$ mpirun -np 2 -host node08,node09 -mca btl ^openib -bind-to core -report-bindings osu_bw
[node08.cluster:23932] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././.][./././.]
[node09.cluster:23409] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./.][./././.]
# OSU MPI Bandwidth Test v3.1.1
# Size Bandwidth (MB/s)
1 0.00
2 0.01
4 0.01
8 0.03
16 0.05
32 0.08
64 6.35
128 0.34
256 3.79
512 8.38
1024 9.53
2048 182.12
4096 203.16
8192 215.49
16384 228.56
32768 231.28
65536 134.46
131072 217.33
262144 226.90
524288 220.98
1048576 234.73
2097152 232.56
4194304 234.78
[mishima_at_node08 OMB-3.1.1]$ mpirun -np 2 -host node08,node09 -mca btl ^openib -bind-to core -report-bindings osu_latency
[node08.cluster:23940] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././.][./././.]
[node09.cluster:23443] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./.][./././.]
# OSU MPI Latency Test v3.1.1
# Size Latency (us)
0 19.99
1 19.79
2 19.87
4 20.04
8 19.99
16 20.00
32 20.12
64 20.85
128 21.27
256 22.73
512 25.57
1024 31.25
2048 41.68
4096 56.41
8192 90.48
16384 177.76
32768 252.26
65536 20489.12
131072 21235.08
262144 20278.82
524288 24009.70
1048576 25395.96
2097152 30260.70
4194304 41058.17
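
The latency jump between 32768 and 65536 bytes can be located mechanically. The following is a rough Python sketch, not an official OSU or Open MPI tool: parse_osu and find_jumps are hypothetical helper names, the factor of 10 is an arbitrary threshold chosen for illustration, and the sample rows are copied from the 1.7.3 osu_latency run above.

```python
def parse_osu(text):
    """Parse OSU benchmark output into (size_bytes, value) tuples, skipping other lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or line.lstrip().startswith("#"):
            continue
        try:
            rows.append((int(parts[0]), float(parts[1])))
        except ValueError:
            # Not a "size value" data row (e.g. part of a shell prompt); ignore it.
            continue
    return rows


def find_jumps(rows, factor=10.0):
    """Return (prev_size, size, ratio) where the value grows by more than `factor`."""
    jumps = []
    for (s0, v0), (s1, v1) in zip(rows, rows[1:]):
        if v0 > 0 and v1 / v0 > factor:
            jumps.append((s0, s1, v1 / v0))
    return jumps


SAMPLE = """\
# OSU MPI Latency Test v3.1.1
# Size Latency (us)
16384 176.36
32768 250.88
65536 20188.73
131072 21141.11
"""

for prev, size, ratio in find_jumps(parse_osu(SAMPLE)):
    print(f"{prev} -> {size} bytes: latency grew {ratio:.1f}x")
```

On the sample rows this reports a single jump, from 32768 to 65536 bytes, of roughly 80x. The same check can be run over the full tables for either version.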

> Can you re-run these tests with processor binding enabled?
>
> On Dec 16, 2013, at 6:36 PM, tmishima_at_[hidden] wrote:
>
> >
> >
> > Hi,
> >
> > I usually use an InfiniBand network, where openmpi-1.7.3 and 1.6.5 both
> > work fine.
> >
> > The other day I had a chance to use a TCP network (1GbE), and I noticed
> > that my application was considerably slower with openmpi-1.7.3 than with
> > openmpi-1.6.5. So I ran the OSU MPI Bandwidth Test v3.1.1, as shown below;
> > the bandwidth for smaller message sizes (< 1024 bytes) is very low
> > compared with 1.6.5. In addition, the latency for larger sizes
> > (> 65536 bytes) looks strange.
> >
> > Does this depend on our local environment, or is some MCA parameter
> > necessary? I'm afraid that something is wrong with TCP in openmpi-1.7.3.
> >
> > openmpi-1.7.3:
> >
> > [mishima_at_node07 OMB-3.1.1]$ mpirun -np 2 -host node07,node08 -mca tbl
> > ^openib osu_bw
> > # OSU MPI Bandwidth Test v3.1.1
> > # Size Bandwidth (MB/s)
> > 1 0.00
> > 2 0.01
> > 4 0.01
> > 8 0.03
> > 16 0.05
> > 32 0.10
> > 64 0.32
> > 128 0.37
> > 256 0.87
> > 512 5.97
> > 1024 20.00
> > 2048 182.87
> > 4096 202.53
> > 8192 215.14
> > 16384 225.16
> > 32768 228.58
> > 65536 115.23
> > 131072 198.24
> > 262144 193.38
> > 524288 233.03
> > 1048576 227.31
> > 2097152 233.07
> > 4194304 233.25
> >
> > [mishima_at_node07 OMB-3.1.1]$ mpirun -np 2 -host node07,node08 -mca btl
> > ^openib osu_latency
> > # OSU MPI Latency Test v3.1.1
> > # Size Latency (us)
> > 0 19.23
> > 1 19.57
> > 2 19.52
> > 4 19.88
> > 8 20.44
> > 16 20.38
> > 32 20.78
> > 64 21.14
> > 128 21.75
> > 256 23.20
> > 512 26.12
> > 1024 31.54
> > 2048 41.72
> > 4096 64.55
> > 8192 107.52
> > 16384 179.23
> > 32768 251.58
> > 65536 20689.68
> > 131072 21179.79
> > 262144 20168.56
> > 524288 22984.83
> > 1048576 25994.54
> > 2097152 30929.55
> > 4194304 38028.48
> >
> > openmpi-1.6.5:
> >
> > [mishima_at_node07 OMB-3.1.1]$ mpirun -np 2 -host node07,node08 -mca tbl
> > ^openib osu_bw
> > # OSU MPI Bandwidth Test v3.1.1
> > # Size Bandwidth (MB/s)
> > 1 0.22
> > 2 0.45
> > 4 0.89
> > 8 1.77
> > 16 3.57
> > 32 7.15
> > 64 14.28
> > 128 28.58
> > 256 57.17
> > 512 96.44
> > 1024 152.38
> > 2048 182.84
> > 4096 203.17
> > 8192 215.13
> > 16384 225.05
> > 32768 100.58
> > 65536 225.24
> > 131072 182.92
> > 262144 192.82
> > 524288 212.92
> > 1048576 233.35
> > 2097152 233.72
> > 4194304 233.89
> >
> > [mishima_at_node07 OMB-3.1.1]$ mpirun -np 2 -host node07,node08 -mca btl
> > ^openib osu_latency
> > # OSU MPI Latency Test v3.1.1
> > # Size Latency (us)
> > 0 17.24
> > 1 17.30
> > 2 17.29
> > 4 17.30
> > 8 24.32
> > 16 17.24
> > 32 17.80
> > 64 17.91
> > 128 19.08
> > 256 20.81
> > 512 22.83
> > 1024 27.82
> > 2048 39.54
> > 4096 52.66
> > 8192 97.70
> > 16384 143.23
> > 32768 215.02
> > 65536 481.08
> > 131072 800.64
> > 262144 1475.12
> > 524288 2698.62
> > 1048576 4992.31
> > 2097152 9558.96
> > 4194304 20801.50
> >
> > Regards,
> > Tetsuya Mishima
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
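
For reference, the small-message bandwidth gap in the quoted osu_bw tables can be expressed as ratios. This is only an illustrative Python sketch: the slowdowns helper and the 2x threshold are hypothetical choices, and the dictionaries hold a subset of the rows from the quoted 1.6.5 and 1.7.3 tables.

```python
# Bandwidth in MB/s keyed by message size in bytes, copied from the quoted tables.
BW_165 = {1: 0.22, 64: 14.28, 512: 96.44, 1024: 152.38, 2048: 182.84}
BW_173 = {1: 0.00, 64: 0.32, 512: 5.97, 1024: 20.00, 2048: 182.87}


def slowdowns(old, new, threshold=2.0):
    """Return {size: old/new} for sizes where `new` is more than `threshold` times slower."""
    out = {}
    for size in sorted(old.keys() & new.keys()):
        if new[size] <= 0:
            # The OSU output rounds to two decimals, so 0.00 gives no finite ratio.
            out[size] = float("inf")
        elif old[size] / new[size] > threshold:
            out[size] = old[size] / new[size]
    return out


for size, ratio in slowdowns(BW_165, BW_173).items():
    print(f"{size:>5} bytes: 1.6.5 is {ratio:.1f}x faster")
```

For these rows the gap is still about 7.6x at 1024 bytes and disappears at 2048 bytes, where both versions report roughly 183 MB/s.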