What I think is happening is this:
The initial transfer rate you are seeing is the burst rate; once you average
over a long run, your sustained transfer rate emerges. Like George said, you
should use a proven tool to measure your bandwidth. We use netperf, a free
tool from HP.
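For reference, a typical netperf measurement looks like the sketch below (host
name `node2` is a placeholder; check the flags against your netperf version):

```shell
# On the receiving node, start the netperf server daemon:
netserver

# On the sending node, run a 60-second sustained TCP stream test
# against the receiver; this reports the long-run average, not the burst rate.
netperf -H node2 -t TCP_STREAM -l 60
```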
That said, Ethernet is not a good candidate for HPC (one reason people don't
use it in backplanes, despite its low cost). Do the math yourself: there is a
54-byte overhead (14 B Ethernet + 20 B IP + 20 B TCP) on every packet sent
over a socket. That is why protocols like uDAPL over InfiniBand are gaining
in popularity.
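To make the math concrete, here is a quick back-of-the-envelope calculation
using the 54-byte figure above and a standard 1500-byte MTU (the MTU value is
an assumption; jumbo frames would change the numbers):

```shell
# Per-frame goodput for TCP over Ethernet at MTU 1500.
mtu=1500
hdr_eth=14; hdr_ip=20; hdr_tcp=20
payload=$((mtu - hdr_ip - hdr_tcp))   # user data per frame: 1460 bytes
frame=$((mtu + hdr_eth))              # bytes on the wire: 1514
efficiency=$(awk -v p="$payload" -v f="$frame" 'BEGIN { printf "%.3f", p/f }')
echo "payload=$payload frame=$frame efficiency=$efficiency"
# -> payload=1460 frame=1514 efficiency=0.964
```

So even before preamble, FCS, and inter-frame gap, roughly 3.6% of the wire
rate is lost to headers alone, and small messages fare much worse.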
On 10/23/06, Jayanta Roy <jroy_at_[hidden]> wrote:
> I have tried lamboot with a host file where odd and even nodes talk among
> themselves using eth0 and talk across the groups using eth1. My transfer
> starts at ~230MB/s, but after a few transfers the rate falls to ~130MB/s,
> and after a long run it finally settles at ~54MB/s. Why does the network
> slow down over time like this?
> On Mon, 23 Oct 2006, Durga Choudhury wrote:
> > Did you try channel bonding? If your OS is Linux, there are plenty of
> > "howtos" on the internet that will tell you how to do it.
> > However, your CPU might be the bottleneck in this case. How much CPU
> > horsepower is left over at 140MB/s?
> > If the CPU *is* the bottleneck, changing your network driver (e.g. from
> > interrupt-based to poll-based packet transfer) might help. If you are
> > unfamiliar with writing network drivers for your OS, this may not be a
> > trivial task, though.
> > Oh, and like I pointed out last time, if all of the above seem OK, try
> > connecting your second link to a separate PC and see if you can get twice
> > the throughput. If so, then the ECMP implementation of your IP stack is
> > what is causing the problem. This is the hardest one to fix. You could
> > rewrite the relevant routines in the ipv4 processing code and recompile
> > the kernel, if you are comfortable with kernel building and your OS is
> > Linux.
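The channel bonding Durga suggests above can be sketched roughly as follows
on a 2.6-era Linux kernel (interface names `eth0`/`eth1`, the address, and the
round-robin mode are assumptions; consult your distribution's bonding howto):

```shell
# Load the bonding driver in round-robin mode with link monitoring,
# then enslave both gigabit ports to the virtual bond0 interface.
modprobe bonding mode=balance-rr miimon=100
ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
ifenslave bond0 eth0 eth1
```

MPI traffic pointed at bond0 would then be spread across both physical links,
though per-stream scaling depends on the bonding mode and the switch.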
> > On 10/23/06, Jayanta Roy <jroy_at_[hidden]> wrote:
> >> Hi,
> >> A while ago I posted a question about fully using the dual gigabit
> >> support. I get a ~140MB/s full-duplex transfer rate in each of these
> >> runs:
> >> mpirun --mca btl_tcp_if_include eth0 -n 4 -bynode -hostfile host a.out
> >> mpirun --mca btl_tcp_if_include eth1 -n 4 -bynode -hostfile host a.out
> >> How can I combine these two ports, or use a proper routing table in
> >> place of the host file? I am using openmpi-1.1.
> >> -Jayanta
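One possible answer to the question quoted above: Open MPI's
`btl_tcp_if_include` parameter accepts a comma-separated interface list, so
both ports can be offered to the TCP BTL in a single run (whether the 1.1
release stripes traffic effectively across them is worth benchmarking, not
assuming):

```shell
# Let the TCP BTL use both gigabit interfaces in one job,
# instead of running separate eth0-only and eth1-only jobs.
mpirun --mca btl_tcp_if_include eth0,eth1 -n 4 -bynode -hostfile host a.out
```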
> >> _______________________________________________
> >> users mailing list
> >> users_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> > --
> > Devil wanted omnipresence;
> > He therefore created communists.
> Jayanta Roy
> National Centre for Radio Astrophysics | Phone : +91-20-25697107
> Tata Institute of Fundamental Research | Fax : +91-20-25692149 Pune
> University Campus, Pune 411 007 | e-mail : jroy_at_[hidden]