
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] 4 PCI-Express Gigabit Ethernet NICs
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-05-09 07:51:04


On May 8, 2009, at 9:01 PM, Allan Menezes wrote:

> Does Open MPI version 1.3.2 on Fedora Core 10 x86_64 work stably with
> 4 gigabit PCI-Express Ethernet cards per node?
>

It should. I routinely test over 3 or 4 IP interfaces (including
IPoIB and 1Gb/10Gb Ethernet NICs).
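
For example, a sketch of how to tell the TCP BTL exactly which
interfaces to use (the interface names, process count, and executable
below are placeholders for your setup):

  mpirun --mca btl tcp,sm,self \
         --mca btl_tcp_if_include eth0,eth1,eth2,eth3 \
         -np 24 ./xhpl

Open MPI should then stripe traffic across all of the interfaces
listed in btl_tcp_if_include.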

> I tried it on six Asus P5Q-VM motherboards, each with 4 cards, 8GB of
> RAM, and an Intel quad-core CPU, as follows:
> eth0 - Intel PRO/1000 PT PCI-Express gigabit card
> eth1 - TP-LINK TG-3468 (Realtek RTL8111B chipset) PCI-Express gigabit
> Ethernet
> eth2 - Realtek RTL8111C chipset gigabit PCI-Express Ethernet, built in
> on the motherboard
> eth3 - TP-LINK TG-3468 (Realtek RTL8111B chipset) PCI-Express gigabit
> Ethernet
> All use an MTU of 3000 and the latest Intel and Realtek drivers from
> their respective websites, on a hand-configured and compiled kernel
> 2.6.28.4.
> I tried HPL 2.0 with GotoBLAS to check my cluster and get approximately
> 220 GFlops stably if I use only eth0, eth1, eth3 or eth0, eth2, eth3,
> but I get 203 GFlops with eth0, eth1, eth2, eth3, and the HPL run fails
> after about the third test.
>

Are you saying that after three consecutive tests, the IP devices
start failing? If so, it sounds like a kernel / driver problem. If
you reboot, do the problems go away?

What happens if you restrict OMPI to just 1 device and run HPL 5-10
times on each device? Do you see the same degradation? E.g., can you
localize which device is causing problems?
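
Something like this should force all MPI traffic onto a single NIC
(the host file name and process count are just placeholders); repeat
with eth0 through eth3 in turn:

  mpirun --mca btl tcp,sm,self --mca btl_tcp_if_include eth1 \
         -np 24 -hostfile myhosts ./xhpl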

> Any help would be very much appreciated, as I would like to use 4
> Ethernet cards per node.
> Note: the measured performance of each card is approximately 922
> Mbits/s with jumbo frames of 3000, using NetPIPE's NPtcp. With four
> cards between two nodes, NPmpi compiled with Open MPI measures
> approximately 3400 Mbits/s, which is good! That scales nearly
> linearly at 4 x ~900 Mbits/s.
> Thank you,
> Allan Menezes
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
Cisco Systems