I think the confusion was my fault because --mca pml teg did not
produce errors and gave almost the same performance as Mpich2 v 1.02p1.
The reason I cannot do what you suggest below is that the
.openmpi/mca-params.conf file, if I am not mistaken, would reside in my
NFS-shared home directory. I have installed the new 5.01 beta version of
Oscar, and /home/allan is a shared directory on my head node where the
Open MPI installation resides (/home/allan/openmpi, with paths set in the
.bash_profile and .bashrc files). I would have to do 16 individual
installations of Open MPI, one on each node under /opt/openmpi, with the
mca-params file residing there. Tell me if I am wrong. I might have
to do this anyway, as this is a heterogeneous cluster with different brands of
Ethernet cards and CPUs.
But it's a good test bed and I have no problems installing Oscar 4.2 on it.
See my later post today, "HPL and TCP", where I tried 0b1 without --mca pml
teg and so on, and got good performance with 15 nodes and Open MPI rc6.
Thank you very much,
Date: Mon, 14 Nov 2005 16:10:36 -0500 (Eastern Standard Time)
From: George Bosilca <bosilca_at_[hidden]>
Subject: Re: [O-MPI users] HPL and TCP
To: Open MPI Users <users_at_[hidden]>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
If there are 2 Ethernet cards, it's better if you can point to the one you
want to use. For that you can modify the .openmpi/mca-params.conf file in
your home directory. All of the options can go in this file, so you will
not have to specify them on the mpirun command line every time.
Here is a small example that contains the hostfile (from which
Open MPI will pick the nodes) as well as the BTL configuration.
rds_hostfile_path = /home/bosilca/.openmpi/machinefile
On the first line I specify that Open MPI is allowed to use the TCP,
shared memory and self devices. Self should always be specified; otherwise
any communication to the same process will fail (it's our loopback device).
The second line specifies that the TCP BTL is allowed to use only the eth0
interface. This line has to reflect your own configuration.
Finally, the third one gives the full path to the hostfile.
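For reference, here is a minimal sketch of what such a three-line mca-params.conf might look like. Only the rds_hostfile_path line survives in this message. The first two lines are reconstructed from the description above and are assumptions: they use the btl parameter (the same one passed as --mca btl on the mpirun command line elsewhere in this thread) and btl_tcp_if_include. The exact parameter names in the original message may have differed, and eth0 should be replaced with whichever interface sits on your own gigabit network.

```ini
# Line 1 (assumed): allow the TCP, shared memory and self BTLs
btl = tcp,sm,self
# Line 2 (assumed): let the TCP BTL use only the eth0 interface
btl_tcp_if_include = eth0
# Line 3 (from the original message): full path to the hostfile
rds_hostfile_path = /home/bosilca/.openmpi/machinefile
```

The first two settings can also be passed per run instead of through the file, for example by adding --mca btl tcp,sm,self --mca btl_tcp_if_include eth0 to the mpirun command line.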
On Mon, 14 Nov 2005, Allan Menezes wrote:
>> Dear Jeff,
>>
>> Sorry I could not test the cluster earlier, but I am having problems with
>> one compute node (I will have to replace it!), so I will have to repeat this
>> test with 15 nodes.
>>
>> Yes, I had 4 NIC cards on the head node, and it was only eth3 that was the
>> gigabit NIC which was communicating to other eth1 gigabit NICs on the compute
>> nodes through a gigabit switch. So though I did not specify the ethernet
>> interface in the switch --mca pml teg I was getting good performance, but in
>> --mca btl tcp not specifying the interface seems to create problems. I wiped
>> out the Linux FC3 installation and tried again with Oscar 4.2 but am having
>> problems with the --mca btl tcp switch:
>>
>> mpirun --mca btl tcp --prefix /home/allan/openmpi --hostfile aa -np 16 ./xhpl
>>
>> The hostfile aa contains the 16 hosts a1.lightning.net to a16.lightning.net.
>> So to recap, the cluster is only connected to itself through the gigabit
>> 16-port switch through gigabit ethernet cards to form a LAN with an IP for
>> each. There is an extra ethernet card built into the compute motherboards
>> that is 10/100Mbps and is not connected to anything yet.
>>
>> Please can you tell me the right mpirun command line for btl tcp for my
>> setup? Is the hostfile right for the mpirun command above? Should it include
>> a1.lightning.net, which is the head node from where I am invoking mpirun? Or
>> should it not have the head node?
>>
>> Thank you,
>> Allan
>>
>> Message: 2
>> Date: Sun, 13 Nov 2005 15:51:30 -0500
>> From: Jeff Squyres <jsquyres_at_[hidden]>
>> Subject: Re: [O-MPI users] HPL anf TCP
>> To: Open MPI Users <users_at_[hidden]>
>> Message-ID: <f143e44670c59a2f345708e6e0fad549_at_[hidden]>
>> Content-Type: text/plain; charset=US-ASCII; format=flowed
>>
>> On Nov 3, 2005, at 8:35 PM, Allan Menezes wrote:
>>>>>> 1. No, I have 4 NICs on the head node and two on each of the 15 other
>>>>>> compute nodes. I use Realtek 8169 gigabit ethernet cards on the
>>>>>> compute nodes as eth1 or eth0 (one only), connected to a gigabit ethernet
>>>>>> switch with a bisection bandwidth of 32Gbps, and a built-in 3Com gigabit
>>>>>> ethernet NIC with the sk98lin driver on the head node (eth3). The other
>>>>>> 10/100M ethernet cards on the head node handle a network laser printer
>>>>>> (eth0) and internet access (eth2). Eth1 is a spare 10/100M which I can
>>>>>> remove. The compute nodes each have two ethernet cards: one 10/100Mbps
>>>>>> ethernet not connected to anything (built in to the M/B) and a PCI Realtek
>>>>>> 8169 gigabit ethernet connected to the gigabit TCP network LAN. When I
>>>>>> tried it without the switches -mca pml teg, the maximum performance I
>>>>>> would get was 9GFlops for P=4, Q=4, N of approximately 12 to 16 thousand,
>>>>>> and NB ridiculously low at a block size of 10. If I tried bigger block
>>>>>> sizes it would run for a long time for large N ~ 16,000 unless I killed
>>>>>> xhpl. I use ATLAS BLAS 3.7.11 libs compiled for each node and linked to
>>>>>> HPL when creating xhpl. I also use the Open MPI mpicc in the HPL makefile
>>>>>> for both compile and link. Maybe I should, according to the new FAQ, use
>>>>>> the TCP switch to select eth3 on the head node?
>> So if I'm reading that right, there's only one network that connects the head
>> node and the compute nodes, right?
>> That's right!