Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Help configuring openmpi
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-05-12 21:01:49


If OMPI is spinning and consuming 100% of your CPU, it usually means
that some MPI function call is polling while it waits for completion.
Given the pattern you are seeing, I'm wondering if some Open MPI
collective call is not finishing until you re-enter the MPI progression
engine.

Specifically, is your pattern like this:

- some MPI collective function
- enter a long period of computation involving no MPI calls
- call another MPI function
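
In code, that pattern would look roughly like the sketch below (a
minimal, hypothetical example -- the function names and the particular
collective are placeholders, not your solver's real code):

    #include <mpi.h>

    /* Placeholder for the long, MPI-free part of the solver. */
    static void long_local_computation(double *local, int n)
    {
        int i;
        for (i = 0; i < n; i++)
            local[i] *= 1.000001;   /* stand-in for real work */
    }

    /* Sketch of the pattern described above. */
    static void compute_step(double *local, int n, MPI_Comm comm)
    {
        double global_min;

        /* 1. Some MPI collective function. */
        MPI_Allreduce(local, &global_min, 1, MPI_DOUBLE, MPI_MIN, comm);

        /* 2. A long period of computation with no MPI calls: during
              this stretch, the library makes no progress on any
              outstanding message traffic for this rank. */
        long_local_computation(local, n);

        /* 3. Another MPI call -- only here does the progress engine
              run again. */
        MPI_Barrier(comm);
    }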

If so, you could well be getting bitten by what is known as an "early
completion" optimization in the Open MPI v1.2 series that allows us to
lower our latency slightly in some cases. In OMPI v1.2.6, we added an
MCA parameter to disable this behavior: set the
pml_ob1_use_early_completion MCA parameter to 0 and try your app again.
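
For example, on the mpirun command line (everything here other than the
MCA option -- the process count and "./your_app" -- is just a
placeholder for however you normally launch):

    mpirun --mca pml_ob1_use_early_completion 0 -np 32 ./your_app

or via the equivalent environment variable before launching:

    export OMPI_MCA_pml_ob1_use_early_completion=0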

This parameter is unnecessary in the [upcoming] v1.3 series; we changed
how completions are done such that this should no longer be an issue.
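
If you want to check whether the parameter exists in your installation
(i.e., that you really have v1.2.6 or later), something like this
should list it:

    ompi_info --param pml ob1 | grep early_completion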

On May 12, 2008, at 9:52 AM, Juan Carlos Larroya Huguet wrote:

> Hi,
>
> I'm using Open MPI on a Linux cluster (Itanium 64, Intel compilers, 8
> processors (4 duals) per node) on which Open MPI is not the default (I
> mean supported) MPI-II implementation. Open MPI was installed easily
> on the cluster, but I think there is a problem with the configuration.
>
> I'm using two MPI codes. The first is a CFD code with a master/slave
> structure; I have done some calculations on 128 processors (1 master
> process and 127 slaves). Open MPI is slightly more efficient than the
> supported MPI-II version.
>
> Then I've moved to a second solver (radiant heat transfer). In this
> case, all the processors are doing the same thing. I have found that
> after the initial phase of data reading, some processors start to work
> hard while the others (even though they consume 99% of CPU) are
> waiting for something! In fact, 15 processes out of 32 are working
> (all of the processes are consuming 99% of CPU). As soon as they
> finish their calculation, another 12 processes start to do the job,
> and when those 12 finish, the remaining 4 do the job.
>
> Looking at the computational time ("temps apres petits calculs" =
> "time after the small computations"), with the official MPI-II version
> on the cluster I obtain:
>
> output.000: temps apres petits calculs = 170.445202827454
> output.001: temps apres petits calculs = 170.657078027725
> output.002: temps apres petits calculs = 168.880963802338
> output.003: temps apres petits calculs = 172.611718893051
> output.004: temps apres petits calculs = 169.420207977295
> output.005: temps apres petits calculs = 168.880684852600
> output.006: temps apres petits calculs = 170.222792863846
> output.007: temps apres petits calculs = 172.987339973450
> output.008: temps apres petits calculs = 170.321479082108
> output.009: temps apres petits calculs = 167.417831182480
> output.010: temps apres petits calculs = 170.633100032806
> output.011: temps apres petits calculs = 168.988963842392
> output.012: temps apres petits calculs = 166.893934011459
> output.013: temps apres petits calculs = 169.844722032547
> output.014: temps apres petits calculs = 169.541869163513
> output.015: temps apres petits calculs = 166.023182868958
> output.016: temps apres petits calculs = 166.047858953476
> output.017: temps apres petits calculs = 166.298271894455
> output.018: temps apres petits calculs = 166.990653991699
> output.019: temps apres petits calculs = 170.565690040588
> output.020: temps apres petits calculs = 170.455694913864
> output.021: temps apres petits calculs = 170.545780897141
> output.022: temps apres petits calculs = 165.962821960449
> output.023: temps apres petits calculs = 169.934472084045
> output.024: temps apres petits calculs = 170.169304847717
> output.025: temps apres petits calculs = 172.316897153854
> output.026: temps apres petits calculs = 166.030095100403
> output.027: temps apres petits calculs = 168.219340801239
> output.028: temps apres petits calculs = 165.486129045486
> output.029: temps apres petits calculs = 165.923212051392
> output.030: temps apres petits calculs = 165.996737957001
> output.031: temps apres petits calculs = 167.544650793076
>
> All of the processes consume more or less the same CPU time.
>
> With Open MPI I've obtained:
>
> output.000: temps apres petits calculs = 158.906322956085
> output.001: temps apres petits calculs = 160.753660202026
> output.002: temps apres petits calculs = 161.286659002304
> output.003: temps apres petits calculs = 169.431221961975
> output.004: temps apres petits calculs = 163.511161088943
> output.005: temps apres petits calculs = 160.547757863998
> output.006: temps apres petits calculs = 161.222673892975
> output.007: temps apres petits calculs = 325.977787017822
> output.008: temps apres petits calculs = 321.527663946152
> output.009: temps apres petits calculs = 326.429191827774
> output.010: temps apres petits calculs = 321.229686975479
> output.011: temps apres petits calculs = 160.507288932800
> output.012: temps apres petits calculs = 158.480596065521
> output.013: temps apres petits calculs = 169.135869979858
> output.014: temps apres petits calculs = 158.526450872421
> output.015: temps apres petits calculs = 486.637645006180
> output.016: temps apres petits calculs = 483.884088993073
> output.017: temps apres petits calculs = 480.200496196747
> output.018: temps apres petits calculs = 483.166898012161
> output.019: temps apres petits calculs = 323.687628030777
> output.020: temps apres petits calculs = 319.833092927933
> output.021: temps apres petits calculs = 329.558218955994
> output.022: temps apres petits calculs = 329.199027061462
> output.023: temps apres petits calculs = 322.116630077362
> output.024: temps apres petits calculs = 322.238983869553
> output.025: temps apres petits calculs = 322.890433073044
> output.026: temps apres petits calculs = 322.439801216125
> output.027: temps apres petits calculs = 157.899522066116
> output.028: temps apres petits calculs = 159.247365951538
> output.029: temps apres petits calculs = 158.351451158524
> output.030: temps apres petits calculs = 158.714610815048
> output.031: temps apres petits calculs = 480.177379846573
>
> 15 processes have similar times (close to those obtained with the
> official MPI), then 12, then 4, as explained previously.
>
> I suppose we need to tune the configuration of Open MPI. Do you know
> how to do that?
>
> Thanks in advance
>
> JC
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
Cisco Systems