Subject: Re: [OMPI users] OpenMPI vs Intel MPI
From: Swamy Kandadai (swamy_at_[hidden])
Date: 2009-07-02 09:47:21


I am running on a 2.66 GHz Nehalem node. On this node, turbo mode and
hyperthreading are both enabled.
When I run LINPACK with Intel MPI, I get 82.68 GFlops without much

I then ran with OpenMPI (I have OpenMPI 1.2.8, but my colleague was using
1.3.2), with the same MKL libraries for both OpenMPI and Intel MPI.
With OpenMPI, the best I have got so far is 80.22 GFlops, and I
could never get close to what I am getting with Intel MPI.
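
To put the two figures side by side, the gap works out to roughly 3%
relative to the Intel MPI result. A small sketch of that arithmetic follows;
gap_percent is only a hypothetical helper name, and the two inputs are the
GFlops numbers quoted above.

```shell
#!/bin/sh
# Hypothetical helper: relative shortfall of result b against result a, in percent.
# LC_ALL=C keeps the decimal point independent of the user's locale.
gap_percent() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", (a - b) / a * 100 }'
}

# Intel MPI result first, OpenMPI result second (both in GFlops).
gap_percent 82.68 80.22
```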
Here are my options with OpenMPI:

mpirun -n 8 --machinefile hf --mca rmaps_rank_file_path rankfile \
    --mca coll_sm_info_num_procs 8 --mca btl self,sm -mca mpi_leave_pinned 1 \
    ./xhpl_ompi

Here is my rankfile:

cat rankfile
rank 0=i02n05 slot=0
rank 1=i02n05 slot=1
rank 2=i02n05 slot=2
rank 3=i02n05 slot=3
rank 4=i02n05 slot=4
rank 5=i02n05 slot=5
rank 6=i02n05 slot=6
rank 7=i02n05 slot=7
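
A rankfile of this shape can also be generated instead of typed by hand.
The following is only a minimal sketch: it assumes a single host and that
rank N should go to slot N, as in the listing above, and make_rankfile is a
hypothetical helper name, not part of OpenMPI.

```shell
#!/bin/sh
# Hypothetical helper: print one "rank N=HOST slot=N" line per rank,
# placing rank N on slot N of a single host.
make_rankfile() {
    host=$1
    nranks=$2
    i=0
    while [ "$i" -lt "$nranks" ]; do
        echo "rank $i=$host slot=$i"
        i=$((i + 1))
    done
}

# Host name and rank count taken from the rankfile shown above.
make_rankfile i02n05 8
```

Redirecting the output (for example, to a file named rankfile) gives the
file that is passed to rmaps_rank_file_path.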

In this case the physical cores are 0-7 while the additional logical
processors with hyperthreading are 8-15.
With the "top" command, I could see that all 8 tasks are running on 8
different physical cores. I did not see 2 MPI tasks running on the same
physical core. Also, the program is not paging, as the problem size fits
in memory.

Do you have any ideas on how I can improve the performance so that it
matches the Intel MPI performance?
Any suggestions will be greatly appreciated.

Swamy Kandadai

