Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Strange TCP latency results on Amazon EC2
From: Roberto Rey (eros.83_at_[hidden])
Date: 2012-01-12 10:28:08


Thanks for your reply!

I'm using the TCP BTL because there is no other option on Amazon with 10
Gbit Ethernet.

I also tried MPICH2 1.4 and got 60 microseconds too, so I am very
confused about it.

Regarding hyperthreading and process binding settings: I am using only one
MPI process on each node (2 nodes for a classical ping-pong latency
benchmark). I don't see how those settings could affect this test, but I can
try anything that anyone suggests.
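
For reference, this is roughly what I mean by a classical ping-pong latency test. It is only a minimal sketch in Python over loopback, not what NetPIPE, netperf, HPCbench or the TCP BTL actually do internally. The function names, the 32-byte default message size and the decision to set TCP_NODELAY on both sockets are my own assumptions for illustration.

```python
import socket
import threading
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock (a single recv may return fewer)."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_server(listener, msg_size, iterations):
    """Accept one connection and bounce every message back unchanged."""
    conn, _ = listener.accept()
    with conn:
        # Assumption: disable Nagle's algorithm so small messages go out at once.
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(iterations):
            conn.sendall(recv_exact(conn, msg_size))


def pingpong_latency_us(msg_size=32, iterations=1000, host="127.0.0.1"):
    """Return the mean one-way latency in microseconds (half the round trip)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(
        target=echo_server, args=(listener, msg_size, iterations)
    )
    server.start()

    payload = b"x" * msg_size
    with socket.create_connection((host, port)) as client:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(iterations):
            client.sendall(payload)          # ping
            recv_exact(client, msg_size)     # pong
        elapsed = time.perf_counter() - start

    server.join()
    listener.close()
    # One iteration is a full round trip, so halve it for one-way latency.
    return elapsed / iterations / 2 * 1e6


if __name__ == "__main__":
    print(f"mean one-way latency: {pingpong_latency_us():.1f} us")
```

Running it between two nodes would need the server half started on the remote host instead of in a thread. Over loopback it only shows the measurement method, so its numbers are not comparable to the EC2 results.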

2012/1/12 Jeff Squyres <jsquyres_at_[hidden]>

> Hi Roberto.
>
> We've had strange reports of performance from EC2 before; it's actually
> been on my to-do list to go check this out in detail. I made contact with
> the EC2 folks at Supercomputing late last year. They've hooked me up with
> some credits on EC2 to go check out what's happening, but the pent-up email
> deluge from the Christmas vacation and my travel to the MPI Forum this week
> prevented me from testing yet.
>
> I hope to be able to get time to test Open MPI on EC2 next week and see
> what's going on.
>
> It's very strange to me that Open MPI is getting *better* than raw TCP
> performance. I don't have an immediate explanation for that -- if you're
> using the TCP BTL, then OMPI should be using TCP sockets, just like netpipe
> and the others.
>
> You *might* want to check hyperthreading and process binding settings in
> all your tests.
>
>
> On Jan 12, 2012, at 7:04 AM, Roberto Rey wrote:
>
> > Hi again,
> >
> > Today I tried another TCP benchmark included in the hpcbench suite, and
> > with a ping-pong test I'm also getting 100us of latency. Then I tried
> > netperf and got the same result.
> >
> > So, in summary, I'm measuring TCP latency with message sizes between
> > 1 and 32 bytes:
> >
> > Netperf over TCP -> 100us
> > Netpipe over TCP (NPtcp) -> 100us
> > HPCbench over TCP -> 100us
> > Netpipe over OpenMPI (NPmpi) -> 60us
> > HPCBench over OpenMPI -> 60us
> >
> > Any clues?
> >
> > Thanks a lot!
> >
> > 2012/1/10 Roberto Rey <eros.83_at_[hidden]>
> > Hi,
> >
> > I'm running some tests on EC2 cluster instances with 10 Gigabit Ethernet
> > hardware and I'm getting strange latency results with Netpipe and OpenMPI.
> >
> > If I run Netpipe over OpenMPI (NPmpi) I get a network latency around 60
> > microseconds for small messages (less than 2kbytes). However, when I run
> > Netpipe over TCP (NPtcp) I always get around 100 microseconds. For bigger
> > messages everything seems to be OK.
> >
> > I'm using the BTL TCP in OpenMPI, so I can't understand why OpenMPI
> > outperforms raw TCP performance for small messages (40us of difference). I
> > have also run the PingPong test from the Intel MPI Benchmarks and the
> > latency results for OpenMPI are very similar (60us) to those obtained with
> > NPmpi.
> >
> > Can OpenMPI outperform Netpipe over TCP? Why? Is OpenMPI doing any
> > optimization in BTL TCP?
> >
> > The results for OpenMPI aren't so good, but we must take into account the
> > network virtualization overhead under Xen.
> >
> > Thanks for your reply
> >
> >
> >
> > --
> > Roberto Rey Expósito
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>

-- 
Roberto Rey Expósito