On Mon, Aug 29, 2011 at 3:51 AM, Xin He <xin.i.he@ericsson.com> wrote:
On 08/25/2011 03:14 PM, Jeff Squyres wrote:
On Aug 25, 2011, at 8:25 AM, Xin He wrote:
Can you edit your configure.m4 directly and test it
and whatnot? I provided the configure.m4 as a
starting point for you. :-) It shouldn't be hard to
make it check linux/tipc.h instead of tipc.h. I'm
happy to give you direct write access to the
bitbucket, if you want it.
I think me having write access is convenient for both of
us :)
Sure -- what's your bitbucket account ID?
It's "letter113"
As we've discussed off-list, we can't take the code
upstream until the contributor agreement is signed,
unfortunately.
The agreement thing is ongoing right now, but it may
take some time.
No worries. Lawyers tend to take time when reviewing this
stuff; we've seen this pattern in most organizations who
sign the OMPI agreement.
But to save time, could you guys run some tests on the TIPC BTL,
so that the code can be used as soon as the agreement is ready?
I don't know if any of us has the TIPC support libraries
installed.
It is easy to get TIPC support. It is actually part of the
kernel. To get TIPC working, you only have to configure it
using "tipc-config". Maybe you can check this doc for
information: http://tipc.sourceforge.net/doc/Users_Guide.txt
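For anyone who wants a quick check that TIPC sockets are usable on a test node before building the BTL, here is a minimal Python sketch. It assumes a Linux host and a Python build that exposes socket.AF_TIPC; the function name tipc_available is made up for illustration and is not part of Open MPI or TIPC. It only tests whether a socket can be opened, not whether the bearer has been configured with "tipc-config".

```python
import socket


def tipc_available() -> bool:
    """Return True if this host can open an AF_TIPC socket."""
    family = getattr(socket, "AF_TIPC", None)
    sock_type = getattr(socket, "SOCK_RDM", None)
    if family is None or sock_type is None:
        # This Python build has no TIPC support (for example, not on Linux).
        return False
    try:
        sock = socket.socket(family, sock_type)
    except OSError:
        # Typically means the kernel's tipc module is not loaded.
        return False
    sock.close()
    return True


if __name__ == "__main__":
    print("TIPC sockets available:", tipc_available())
```

If this reports False on Linux, loading the kernel module (for example with "modprobe tipc") is probably the first thing to try.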
So... what *is* TIPC? Is there a writeup anywhere that we
can read about what it is / how it works? For example,
what makes TIPC perform better than TCP?
Sure. Search for "TIPC: Providing Communication for Linux
Clusters". It is a paper written by the author of TIPC that
explains the basics of TIPC, and it should be very useful.
You can also visit the TIPC homepage: http://tipc.sourceforge.net/
I have run some tests using tools like NetPIPE,
OSU and IMB, and the results show that the TIPC BTL
performs better than the TCP BTL.
Great! Can you share any results?
Yes, please check the appendix for the results using
IMB 3.2.
I have done the tests on 2 computers:
  Dell SC1435
  Dual-Core AMD Opteron(tm) Processor 2212 HE x 2
  4 GB memory
  Ubuntu Server 10.04 LTS 32-bit, Linux 2.6.32-24
I'm not familiar with the Dell or Opteron lines -- how
recent are those models?
I ask because your TCP latency is a bit high (about 85us
in 2-process IMB PingPong); it might suggest older
hardware. Or you may have built a debugging version of
Open MPI (if you have a .svn or .hg checkout, that's the
default). See the HACKING top-level file for how to get
an optimized build.
For example, with my debug build of Open MPI on fairly
old Xeons with 1Gb Ethernet, I'm getting the following
PingPong results (note: this is a debug build, not an
optimized build):
-----
$ mpirun --mca btl tcp,self --bynode -np 2 --mca btl_tcp_if_include eth0 hostname
svbu-mpi008
svbu-mpi009
$ mpirun --mca btl tcp,self --bynode -np 2 --mca btl_tcp_if_include eth0 IMB-MPI1 PingPong
#---------------------------------------------------
# Intel (R) MPI Benchmark Suite V3.2, MPI-1 part
#---------------------------------------------------
...
#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000        57.31         0.00
            1         1000        57.71         0.02
            2         1000        57.73         0.03
            4         1000        57.81         0.07
            8         1000        57.78         0.13
-----
With an optimized build, it shaves off only a few us
(which isn't too important in this case, but it does
matter in the low-latency transport cases):
-----
#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000        54.62         0.00
            1         1000        54.92         0.02
            2         1000        55.15         0.03
            4         1000        55.16         0.07
            8         1000        55.15         0.14
-----
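To put a number on "a few us", here is a small Python sketch that compares the t[usec] columns of the two PingPong tables above, which works out to roughly 2.7 us saved per message size. It also recomputes the Mbytes/sec column under the assumption that IMB counts 1 Mbyte as 2**20 bytes; that convention is not stated in this thread, but it reproduces the printed values after rounding. The helper name mbytes_per_sec is made up for illustration.

```python
# t[usec] columns copied from the two PingPong tables above.
sizes = [0, 1, 2, 4, 8]
debug_usec = [57.31, 57.71, 57.73, 57.81, 57.78]
optimized_usec = [54.62, 54.92, 55.15, 55.16, 55.15]


def mbytes_per_sec(nbytes: int, usec: float) -> float:
    """Throughput, assuming 1 Mbyte = 2**20 bytes (assumed IMB convention)."""
    return nbytes / 2**20 * 1e6 / usec


# Per-size latency saved by the optimized build, and the average saving.
savings = [d - o for d, o in zip(debug_usec, optimized_usec)]
mean_saving = sum(savings) / len(savings)

if __name__ == "__main__":
    for n, d, o in zip(sizes, debug_usec, optimized_usec):
        print(f"{n:>6} bytes: debug {d:.2f} us, optimized {o:.2f} us, "
              f"optimized throughput {mbytes_per_sec(n, o):.2f} Mbytes/sec")
    print(f"mean saving: {mean_saving:.2f} us")
```

The same arithmetic can be applied to the TIPC and TCP columns from the IMB results in the appendix to compare the two BTLs size by size.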
Hi, I think these models are reasonably new :)
The results I gave you were measured with 2 processes, but on 2
different servers. Do I understand correctly that the results you
showed are for 2 processes on one machine?
But I did build with debug enabled, so I will try an optimized
build then :)
BTW, I forgot to tell you about SM & TIPC. Unfortunately,
TIPC does not beat SM...
/Xin