On Tue, Jan 10, 2012 at 10:02 AM, Roberto Rey <eros.83_at_[hidden]> wrote:
> I'm running some tests on EC2 cluster instances with 10 Gigabit Ethernet
> hardware and I'm getting strange latency results with Netpipe and OpenMPI.
- There are three types of instances that can use 10 GbE. Are you using
"cc1.4xlarge", "cc2.8xlarge", or "cg1.4xlarge"?
- Did you set up a placement group?
- Also, which AMI are you using?
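If you are not sure, the EC2 instance metadata service can answer the
first and third questions from inside the instance:

  # Standard EC2 metadata endpoints, available on every instance
  curl -s http://169.254.169.254/latest/meta-data/instance-type
  curl -s http://169.254.169.254/latest/meta-data/ami-id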
> I'm using the BTL TCP in OpenMPI, so I can't understand why OpenMPI
> outperforms raw TCP performance for small messages (40us of difference).
> Can OpenMPI outperform Netpipe over TCP? Why? Is OpenMPI doing any
> optimization in BTL TCP?
It is indeed interesting!
If we run strace with timing (e.g. strace -tt) on both NPmpi and
NPtcp and compare the traces, then we can get a better idea of where
the time goes. It is possible that one is doing more busy polling
than the other, and/or triggering Xen to handle things a bit
differently. We should also check the socket options, and the system
call latency, to see whether the network is really responsible for
the extra 40us.
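A rough sketch of what I mean (hostnames and paths are placeholders;
adjust for how you launch Netpipe):

  # Plain TCP: start the receiver first, then the transmitter
  strace -tt -o nptcp_recv.trace ./NPtcp
  strace -tt -o nptcp_send.trace ./NPtcp -h <receiver-host>

  # MPI version: wrap each rank in strace; -ff writes one file per PID
  mpirun -np 2 strace -tt -ff -o npmpi.trace ./NPmpi

  # Compare the socket setup, e.g. whether both set TCP_NODELAY
  grep setsockopt *.trace*

The timestamps also show whether the extra time is spent inside the
send/recv system calls themselves or between them.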
> The results for OpenMPI aren't so good but we must take into account the
> network virtualization overhead under Xen
If you are running Cluster Compute Instances, then you are using HVM.
If things are set up properly (HVM and a placement group), then you
can even get a TOP500 computer on EC2... Amazon uses a similar setup
for their TOP500 submission.
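For reference, the placement group setup looks roughly like this (the
group name and AMI ID are placeholders, and the syntax shown is the
AWS CLI's; the classic ec2-api-tools have equivalent commands):

  # Create a cluster placement group, then launch the instances into it
  aws ec2 create-placement-group --group-name my-hpc --strategy cluster
  aws ec2 run-instances --image-id ami-xxxxxxxx --instance-type cc2.8xlarge \
      --count 2 --placement GroupName=my-hpc

Instances in the same cluster placement group get the low-latency
10 GbE path between them; without it, latency numbers can vary a lot
from run to run.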
Open Grid Scheduler / Grid Engine
Scalable Grid Engine Support Program