Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] 3.5 seconds before application launches
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-02-25 20:20:51

Dorian raises a good point.

You might want to try some simple tests of launching non-MPI codes
(e.g., hostname, uptime, etc.) and see how they fare. Those will more
accurately depict OMPI's launching speeds. Getting through MPI_INIT
is another matter (although on 2 nodes, the startup should be pretty
darn fast).
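
One rough way to separate the two costs is to time a non-MPI launch against a launch of the real application. The sketch below is only an illustration: `elapsed_ms` is a made-up helper name, `node1`, `node2` and `./my_mpi_app` are placeholders for your own hosts and binary, and it assumes GNU `date` (for `%N`).

```shell
# elapsed_ms CMD...: run CMD, discard its output, and print the
# wall-clock time it took in milliseconds.
# Assumption: GNU date, which supports %N (nanoseconds).
elapsed_ms() {
    _t0=$(date +%s%N)
    "$@" >/dev/null 2>&1
    _t1=$(date +%s%N)
    echo $(( (_t1 - _t0) / 1000000 ))
}

# Example usage (placeholder hosts and binary; uncomment to run):
# elapsed_ms mpirun -np 2 --host node1,node2 hostname       # launch cost only
# elapsed_ms mpirun -np 2 --host node1,node2 ./my_mpi_app   # launch + MPI_INIT + app
```

If the `hostname` run already shows most of the 3.5 seconds, the delay is likely in the launcher rather than in MPI_INIT.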

Two other things that *may* impact you:

1. Is your ssh speed between the machines slow? OMPI uses ssh by
default, but will fall back to rsh (or you can force rsh if you
want). MVAPICH may use rsh by default...? (I don't actually know)

2. OMPI may be spending time creating shared memory files. You can
disable OMPI's use of shared memory by running with:

     mpirun --mca btl ^sm ...

This means "use anything except the 'sm' (shared memory) transport for
MPI messages"; the leading ^ excludes the components listed after it.
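
To check both suspects, something like the following could be used. Since the delay was reported as variable, it repeats each command and keeps the fastest run. This is a sketch under assumptions: `best_of_ms` is a hypothetical helper, `node1`, `node2` and `./my_mpi_app` are placeholders, and GNU `date` is assumed.

```shell
# best_of_ms N CMD...: run CMD N times, discard its output, and print
# the fastest wall-clock time in milliseconds.
# Assumption: GNU date, which supports %N (nanoseconds).
best_of_ms() {
    _runs=$1
    shift
    _best=
    _i=0
    while [ "$_i" -lt "$_runs" ]; do
        _t0=$(date +%s%N)
        "$@" >/dev/null 2>&1
        _t1=$(date +%s%N)
        _ms=$(( (_t1 - _t0) / 1000000 ))
        # Keep the smallest time seen so far.
        if [ -z "$_best" ] || [ "$_ms" -lt "$_best" ]; then
            _best=$_ms
        fi
        _i=$((_i + 1))
    done
    echo "$_best"
}

# Example usage (placeholder hosts and binary; uncomment to run):
# best_of_ms 3 ssh node2 true                                            # raw ssh round trip
# best_of_ms 3 mpirun -np 2 --host node1,node2 ./my_mpi_app              # default transports
# best_of_ms 3 mpirun --mca btl ^sm -np 2 --host node1,node2 ./my_mpi_app  # shared memory excluded
```

The shared memory comparison only makes sense with a real MPI program, because a non-MPI command such as hostname never reaches MPI_INIT and so never sets up the MPI transports.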

On Feb 25, 2009, at 4:01 PM, doriankrause wrote:

> Vittorio wrote:
>> Hi!
>> I'm using Open MPI 1.3 on two nodes connected with InfiniBand; I'm using
>> Gentoo Linux x86_64.
>> I've noticed that before any application starts there is a variable amount
>> of time (around 3.5 seconds) in which the terminal just hangs with no
>> output, and then the application starts and works well.
>> I imagined that there might be some initialization routine somewhere
>> in the InfiniBand layer or in the software stack, but as I continued my
>> tests I observed that this "latency" is not present in other MPI
>> implementations (like mvapich2), where my application starts immediately
>> (but performs worse).
>> Is my MPI configuration/installation broken, or is this expected
>> behaviour?
> Hi,
> I'm not really qualified to answer this question, but I know that, in
> contrast to other MPI implementations (MPICH), the modular structure of
> Open MPI is based on shared libs that are dlopened at startup. As symbol
> relocation can be costly, this might be a reason why the startup time is
> higher.
> Have you checked whether this is an mpiexec start issue or the
> MPI_Init call?
> Regards,
> Dorian
>> thanks a lot!
>> Vittorio

Jeff Squyres
Cisco Systems