Open MPI User's Mailing List Archives

From: Tony Ladd (ladd_at_[hidden])
Date: 2006-10-24 09:35:25


I use my own network testing program; I wrote it some time ago because
NetPIPE only tested one-way rates at that point. I haven't tried IMB, but I
looked at the source and it's very similar to what I do: 1) set up buffers
with data, 2) start the clock, 3) call MPI_xxx N times, 4) stop the clock,
5) calculate the rate. IMB tests more things than I do; I just focused on
the calls I use (send, recv, allreduce). I have done a lot of testing of
hardware and software, and I will have some web pages posted soon. I will
put a note here when I do. But a couple of things:
A) I have found the switch is the biggest discriminator if you want to run
HPC over Gigabit Ethernet. Most GigE switches choke when all the ports are
being used at once. That is the usual HPC traffic pattern, but not that of a
typical network, which is what these switches are geared towards. The one
exception I have found is the Extreme Networks x450a-48t. In some test
patterns I found it to be 500 times faster (not a typo) than the s400-48t,
which is its predecessor. I have tested several GigE switches (Extreme,
Force10, HP, Asante), and the x450 is the only one that copes with high
traffic loads in all port configurations. It's expensive for a GigE switch
(~$6500) but worth it in my opinion if you want to do HPC. It's still much
cheaper than
B) You have to test the switch in different port configurations; a random
ring of SendRecv is good for this. I don't think IMB has it in its test
suite, but it's easy to program. Or you can change the order of nodes in the
machinefile to force unfavorable port assignments. A step of 12 is a good
test, since many GigE switches use 12-port ASICs and this forces all the
traffic onto the backplane. On the Summit 400 this causes it to more or less
stop working (rates drop to a few Kbytes/sec along each wire), but the x450
has no problem with the same test. You need to know how your nodes are wired
to the switch to do this test.
C) GAMMA is an extraordinary accomplishment in my view; in a number of tests
with codes like DLPOLY, GROMACS, and VASP it can be 2-3 times the speed of
TCP-based programs with 64 cpus. In many instances I get scaling comparable
to (and occasionally better than) the university HPC system, which has an
InfiniBand interconnect. Note I am not saying GigE is comparable to IB, only
that a typical HPC setup with nodes scattered all over a fat-tree topology
(including oversubscription of the links and switches) is enough of a minus
that an optimized GigE setup can compete, at least up to 48 nodes (96 cpus
in our case). I have worked with Giuseppe Ciaccio for the past 9 months
eradicating some obscure bugs in GAMMA. I find them; he fixes them. We have
GAMMA running on 48 nodes quite reliably, but there are still many issues to
address. GAMMA is very much a research tool; there are a number of
features(?) which would hinder its use in an HPC environment. Basically,
Giuseppe needs help with development. Any volunteers?
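
A minimal sketch of the timing loop and the port-configuration tests from B),
written in plain Python rather than MPI. It is an illustration only, not the
test program described above: measure_rate, random_ring, and strided_hosts are
hypothetical names, the exchange argument is a stand-in for whatever MPI call
is being timed, and the node names are made up. It also assumes the
machinefile lists hosts in switch-port order, which has to be checked against
the actual wiring.

```python
import random
import time


def measure_rate(exchange, nbytes, n_calls):
    """Time n_calls of exchange() and return the rate in bytes per second.

    `exchange` is a placeholder for one MPI call moving `nbytes` bytes.
    """
    start = time.perf_counter()             # start clock
    for _ in range(n_calls):                # call the operation N times
        exchange()
    elapsed = time.perf_counter() - start   # stop clock
    return nbytes * n_calls / elapsed       # calculate rate


def random_ring(n_ranks, seed=0):
    """Map each rank to (send_to, recv_from) on a randomly shuffled ring.

    Every rank must use the same seed so that all ranks build the same ring.
    """
    order = list(range(n_ranks))
    random.Random(seed).shuffle(order)
    ring = {}
    for pos, rank in enumerate(order):
        send_to = order[(pos + 1) % n_ranks]
        recv_from = order[(pos - 1) % n_ranks]
        ring[rank] = (send_to, recv_from)
    return ring


def strided_hosts(hosts, step=12):
    """Reorder hosts so that consecutive ranks sit `step` ports apart.

    Assumes `hosts` is in switch-port order; with 12-port ASICs, neighbouring
    ranks then land on different ASICs.
    """
    return [hosts[i]
            for first in range(step)
            for i in range(first, len(hosts), step)]


if __name__ == "__main__":
    hosts = [f"node{i:02d}" for i in range(48)]   # hypothetical host names
    print("\n".join(strided_hosts(hosts)))        # contents for a machinefile
    print(random_ring(8, seed=1))                 # partners for an 8-rank ring
```

In a real test each rank would look up its own entry in the ring, pass its two
partners to a SendRecv call, and wrap that call in the timing loop. Running
several seeds gives several different port configurations.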

Tony Ladd
Professor, Chemical Engineering
University of Florida
PO Box 116005
Gainesville, FL 32611-6005

Tel: 352-392-6509
FAX: 352-392-9513
Email: tladd_at_[hidden]