Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] 3.5 seconds before application launches
From: Vittorio Giovara (vitto.giova_at_[hidden])
Date: 2009-02-27 09:39:54


Hello, and thanks for both replies,

I tried running a non-MPI program, but I still measured a delay before it
started, around 2 seconds this time.
SSH should be properly configured: I can log in to both machines without a
password, and both Open MPI and MVAPICH use ssh by default.

I've tried these commands:
mpirun --mca btl ^sm -np 2 -host node0 -host node1 ./graph
mpirun --mca btl openib,self -np 2 -host node0 -host node1 ./graph

Apart from a slight performance increase in the ^sm benchmark, the startup
delay is the same.
This is really strange, and I can't figure out the source!
Do you have any other ideas?
Thanks,
Vittorio

Date: Wed, 25 Feb 2009 20:20:51 -0500
From: Jeff Squyres <jsquyres_at_[hidden]>
Subject: Re: [OMPI users] 3.5 seconds before application launches
To: Open MPI Users <users_at_[hidden]>
Message-ID: <86D3B246-1866-4B84-B05C-4D13659F8F1C_at_[hidden]>

Dorian raises a good point.

You might want to try some simple tests of launching non-MPI codes
(e.g., hostname, uptime, etc.) and see how they fare. Those will more
accurately depict OMPI's launching speeds. Getting through MPI_INIT
is another matter (although on 2 nodes, the startup should be pretty
darn fast).
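
A minimal sketch of such a test, under stated assumptions: `LAUNCHER` and
`RUNS` are hypothetical variables introduced here for illustration, not Open
MPI settings. `LAUNCHER` defaults to empty so the script also runs on a
machine without MPI. Because `date +%s` only has one-second resolution, the
command is launched several times and the total is reported.

```shell
#!/bin/sh
# Rough launch-time check: run a command several times and report the
# total wall-clock time.
# LAUNCHER and RUNS are hypothetical knobs for this sketch. On the
# cluster, LAUNCHER could be set to the launcher from this thread, e.g.
# "mpirun -np 2 -host node0 -host node1".
LAUNCHER="${LAUNCHER:-}"
RUNS="${RUNS:-5}"

time_launch() {
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$RUNS" ]; do
        # $LAUNCHER is left unquoted on purpose so it splits into words.
        $LAUNCHER "$@" >/dev/null 2>&1 || return 1
        i=$((i + 1))
    done
    end=$(date +%s)
    echo "$RUNS runs of '$*' took $((end - start)) s in total"
}

# Launch a trivial non-MPI command, as suggested above.
time_launch hostname || echo "launch failed" >&2
```

Running it once with `LAUNCHER` empty and once with the `mpirun` prefix gives
a rough idea of how much of the delay comes from the launcher itself rather
than from MPI_INIT.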

Two other things that *may* impact you:

1. Is your ssh speed between the machines slow? OMPI uses ssh by
default, but will fall back to rsh (or you can force rsh if you
want). MVAPICH may use rsh by default...? (I don't actually know)

2. OMPI may be spending time creating shared memory files. You can
disable OMPI's use of shared memory by running with:

    mpirun --mca btl ^sm ...

Meaning "use anything except the 'sm' (shared memory) transport for
MPI messages".
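
One way to check both points is to time each layer separately, sketched below
under stated assumptions. `CASES` is a hypothetical variable holding one
command per line; its defaults are harmless local commands so the script runs
anywhere. The commented-out list shows what could be substituted on the
cluster, using only the hosts, program, and options already mentioned in this
thread.

```shell
#!/bin/sh
# Time each candidate step once and print one line per step, so the
# slow layer stands out. CASES is a hypothetical knob for this sketch.
CASES="${CASES:-true
sleep 1}"
# On the cluster, CASES could instead be set to (an assumption based on
# the commands in this thread):
#   ssh node1 true
#   mpirun -np 2 -host node0 -host node1 hostname
#   mpirun -np 2 -host node0 -host node1 ./graph
#   mpirun --mca btl ^sm -np 2 -host node0 -host node1 ./graph

compare_launches() {
    echo "$CASES" | while IFS= read -r cmd; do
        [ -n "$cmd" ] || continue
        start=$(date +%s)
        # stdin is redirected so commands such as ssh cannot swallow
        # the remaining lines of the list.
        sh -c "$cmd" >/dev/null 2>&1 </dev/null
        status=$?
        end=$(date +%s)
        printf '%3d s  exit=%d  %s\n' "$((end - start))" "$status" "$cmd"
    done
}

compare_launches
```

If the plain `ssh` line is already slow, the problem is outside Open MPI. If
only the lines that run the MPI program are slow, the time is more likely
spent in MPI_INIT, where the shared memory setup would happen.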

On Feb 25, 2009, at 4:01 PM, doriankrause wrote:

> Vittorio wrote:
>> Hi!
>> I'm using Open MPI 1.3 on two nodes connected with InfiniBand; I'm using
>> Gentoo Linux x86_64.
>>
>> I've noticed that before any application starts there is a variable amount
>> of time (around 3.5 seconds) in which the terminal just hangs with no
>> output; then the application starts and works well.
>>
>> I imagined that there might be some initialization routine somewhere in the
>> InfiniBand layer or in the software stack, but as I continued my tests I
>> observed that this "latency" is not present in other MPI implementations
>> (like mvapich2), where my application starts immediately (but performs
>> worse).
>>
>> Is my MPI configuration/installation broken, or is this expected behaviour?
>>
>
> Hi,
>
> I'm not really qualified to answer this question, but I know that, in
> contrast to other MPI implementations (MPICH), the modular structure of
> Open MPI is based on shared libs that are dlopened at startup. As symbol
> relocation can be costly, this might be a reason why the startup time is
> higher.
>
> Have you checked whether this is an mpiexec start issue or the MPI_Init
> call?
>
> Regards,
> Dorian
>
>> thanks a lot!
>> Vittorio
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

--
Jeff Squyres
Cisco Systems