Open MPI User's Mailing List Archives

From: Tom Rosmond (rosmond_at_[hidden])
Date: 2006-01-04 17:05:11


Thanks for the quick reply. I ran my tests with a hostfile containing:
cedar.reachone.com slots=4

I clearly misunderstood the role of the 'slots' parameter, because
when I removed it, OPENMPI slightly outperformed LAM, as I
assume it should. Thanks for the help.

Tom

Brian Barrett wrote:

>On Jan 4, 2006, at 4:24 PM, Tom Rosmond wrote:
>
>
>
>>I have been using LAM-MPI for many years on PC/Linux systems and
>>have been quite pleased with its performance. However, at the urging
>>of the LAM-MPI website, I have decided to switch to OPENMPI. For much
>>of my preliminary testing I work on a single-processor workstation
>>(see the attached 'config.log' and ompi_info.log files for some of
>>the specifics of my system). I frequently run with more than one
>>virtual MPI processor (i.e. oversubscribe the real processor) to test
>>my code. With LAM the runtime penalty for this is usually
>>insignificant for 2-4 virtual processors, but with OPENMPI it has
>>been prohibitive. Below is a matrix of runtimes for a simple MPI
>>matrix transpose code using mpi_sendrecv (I tried other variations of
>>blocking/non-blocking, synchronous/non-synchronous send/recv with
>>similar results).
>>
>> message size = 262144 bytes
>>
>>            LAM            OPENMPI
>> 1 proc:    .02575 secs    .02513 secs
>> 2 proc:    .04603 secs    10.069 secs
>> 4 proc:    .04903 secs    35.422 secs
>>
>>I am pretty sure that LAM exploits the fact that the virtual
>>processors are all sharing the same memory, so communication is via
>>memory and/or the PCI bus of the system, while my OPENMPI
>>configuration doesn't exploit this. Is this a reasonable diagnosis of
>>the dramatic difference in performance? More importantly, how do I
>>reconfigure OPENMPI to match the LAM performance?
>>
>>
>
>Based on the output of ompi_info, you should be using shared memory
>with Open MPI (as you are with LAM/MPI). What RPI are you using with
>LAM/MPI (just so we have some idea what you are comparing to)? And
>how are you running Open MPI (what command are you passing to mpirun,
>and if you include a hostfile, what is in that host file)?
>
>If you tell Open MPI via a hostfile that a machine has 2 CPUs when it
>only has 1 and try to run 2 processes on it, you will run into severe
>performance issues. In that case, Open MPI will poll very quickly on
>the CPUs, not giving up the CPU when there is nothing to do. If Open
>MPI is told that there is only 1 CPU and you run 2 processes of the
>same job on that node, then it will be much better about giving up the
>CPU. That would be where I would start looking.
>
>If you have some test code you could share, I'd love to see it - it
>would help in duplicating your results and finding a solution...
>
>Brian
>
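
The hostfile behaviour discussed in this thread can be sketched as
follows. This is a minimal illustration rather than the poster's actual
file: the hostname is the one quoted in the thread, and the assumption
is that the machine really has a single CPU. Declaring no more slots
than there are physical CPUs lets Open MPI recognise that the node is
oversubscribed when more processes are launched on it.

```
# Hostfile sketch for a single-CPU workstation (assumed configuration).
# Declare the real CPU count, so that running 2-4 processes is treated
# as oversubscription and idle processes give up the CPU:
cedar.reachone.com slots=1

# The setting reported in the thread, which claims 4 CPUs and led to
# aggressive polling on the single real processor:
# cedar.reachone.com slots=4
```

A run against such a file might look like
`mpirun -np 4 --hostfile myhosts ./transpose`, where `myhosts` and
`./transpose` are hypothetical names for the hostfile and the test
program.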