Open MPI User's Mailing List Archives

From: Durga Choudhury (dpchoudh_at_[hidden])
Date: 2006-10-24 10:28:29


Very interesting indeed! Message passing over raw Ethernet on cheap COTS
PCs is exactly what people like me, with very shallow pockets, need. Great
work! What would make this effort *really* cool is a one-to-one mapping of
APIs from the MPI domain to the GAMMA domain so that, for example, existing
MPI code could be ported with a trivial amount of work. Professor Ladd, how
did you do this porting, e.g. for VASP? How much of an effort was it? (Or
did the VASP guys already have a version running over GAMMA?)

Thanks
Durga

On 10/24/06, Tony Ladd <ladd_at_[hidden]> wrote:
>
> Lisandro
>
> I use my own network testing program; I wrote it some time ago because
> NetPIPE only tested one-way rates at that point. I haven't tried IMB, but
> I looked at the source and it's very similar to what I do: 1) set up
> buffers with data, 2) start clock, 3) call MPI_xxx N times, 4) stop clock,
> 5) calculate rate. IMB tests more things than I do; I just focused on the
> calls I use (send, recv, allreduce). I have done a lot of testing of
> hardware and software. I will have some web pages posted soon and will put
> a note here when I do. But a couple of things.
> A) I have found the switch is the biggest discriminator if you want to run
> HPC over Gigabit Ethernet. Most GigE switches choke when all the ports are
> being used at once. This is the usual HPC pattern, but not that of a
> typical network, which is what these switches are geared towards. The one
> exception I have found is the Extreme Networks x450a-48t. In some test
> patterns I found it to be 500 times faster (not a typo) than the s400-48t,
> which is its predecessor. I have tested several GigE switches (Extreme,
> Force10, HP, Asante) and the x450 is the only one that copes with high
> traffic loads in all port configurations. It's expensive for a GigE switch
> (~$6500) but worth it in my opinion if you want to do HPC. It's still much
> cheaper than InfiniBand.
> B) You have to test the switch in different port configurations; a random
> ring of SendRecv is good for this. I don't think IMB has it in its test
> suite, but it's easy to program. Or you can change the order of nodes in
> the machinefile to force unfavorable port assignments. A step of 12 is a
> good test since many GigE switches use 12-port ASICs and this forces all
> the traffic onto the backplane. On the Summit 400 this causes it to more
> or less stop working (rates drop to a few Kbytes/sec along each wire), but
> the x450 has no problem with the same test. You need to know how your
> nodes are wired to the switch to do this test.
> C) GAMMA is an extraordinary accomplishment in my view; in a number of
> tests with codes like DLPOLY, GROMACS, and VASP it can be 2-3 times the
> speed of TCP-based programs with 64 CPUs. In many instances I get
> comparable (and occasionally better) scaling than with the university HPC
> system, which has an InfiniBand interconnect. Note I am not saying GigE is
> comparable to IB, but that a typical HPC setup with nodes scattered all
> over a fat-tree topology (including oversubscription of the links and
> switches) is enough of a minus that an optimized GigE setup can compete,
> at least up to 48 nodes (96 CPUs in our case). I have worked with Giuseppe
> Ciaccio for the past 9 months eradicating some obscure bugs in GAMMA. I
> find them; he fixes them. We have GAMMA running on 48 nodes quite reliably
> but there are still many issues to address. GAMMA is very much a research
> tool; there are a number of features(?) which would hinder its use in an
> HPC environment. Basically Giuseppe needs help with development. Any
> volunteers?
>
> Tony
> -------------------------------
> Tony Ladd
> Professor, Chemical Engineering
> University of Florida
> PO Box 116005
> Gainesville, FL 32611-6005
>
> Tel: 352-392-6509
> FAX: 352-392-9513
> Email: tladd_at_[hidden]
> Web: http://ladd.che.ufl.edu
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
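
The five-step timing procedure quoted above (set up buffers, start clock,
call N times, stop clock, calculate rate) can be sketched roughly as
follows. This is only an illustration and not Tony's actual program: the
names measure_rate and fake_transfer are made up here, and a local buffer
copy stands in for the MPI call so the sketch runs without an MPI
installation. In a real test the callable would wrap a send, recv, or
allreduce.

```python
import time


def measure_rate(op, nbytes, n_calls):
    """Return bytes/second achieved by calling op(buf) n_calls times.

    op is a hypothetical stand-in for the MPI call under test.
    """
    buf = bytearray(nbytes)                 # 1) set up a buffer with data
    start = time.perf_counter()             # 2) start clock
    for _ in range(n_calls):                # 3) call the operation N times
        op(buf)
    elapsed = time.perf_counter() - start   # 4) stop clock
    return nbytes * n_calls / elapsed       # 5) calculate rate


def fake_transfer(buf):
    # Placeholder only: copies the buffer instead of sending it anywhere.
    return bytes(buf)


if __name__ == "__main__":
    rate = measure_rate(fake_transfer, nbytes=1 << 20, n_calls=100)
    print(f"{rate / 1e6:.1f} MB/s")
```

The number this prints reflects local memory copies, not network speed; it
only shows where the clock starts and stops relative to the repeated call.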

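The "step of 12" machinefile reordering from point B can be sketched along
these lines. It assumes that entry i of the input list is the host wired to
switch port i, which is exactly the wiring knowledge Tony says you need; the
host names node00..node47 and the function name strided_order are
hypothetical, chosen only for the example.

```python
def strided_order(nodes, step=12):
    """Reorder nodes so that consecutive entries sit `step` ports apart.

    Assumes nodes[i] is the host on switch port i (hypothetical layout).
    """
    order = []
    for offset in range(step):
        # Take every step-th host, starting from each offset in turn.
        order.extend(nodes[offset::step])
    return order


if __name__ == "__main__":
    hosts = [f"node{i:02d}" for i in range(48)]  # made-up host names
    # One host per line, as in a simple machinefile.
    print("\n".join(strided_order(hosts)))
```

With 48 hosts and 12-port groups, every pair of neighbouring ranks in the
resulting list (including the wrap-around from last to first) lands in a
different group of 12 ports, so a ring exchange has to cross between groups
on every hop.
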
-- 
Devil wanted omnipresence;
He therefore created communists.