Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] jobs with more than 2,500 processes will not even start
From: Lydia Heck (lydia.heck_at_[hidden])
Date: 2010-12-14 12:32:54


I have experimented a bit more and found that if I set

OMPI_MCA_plm_rsh_num_concurrent=1024

a job with more than 2,500 processes will start and run.

However, when I searched the Open MPI web site for the variable, I could not
find any documentation of it.
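
For reference, a minimal sketch of how the setting above can be applied and inspected from a shell. The value 1024 is the one reported in this message; the executable name `./my_app` and the process count in the commented `mpirun` line are only illustrative. Listing the parameter with `ompi_info` should work where Open MPI is installed, although the exact options and output may differ between versions (newer releases may need `--level 9` to show it).

```shell
# Raise the number of rsh/ssh daemon launches that may run concurrently.
# 1024 is the value reported in this message; tune it for your own cluster.
export OMPI_MCA_plm_rsh_num_concurrent=1024

# Equivalent per-invocation form (./my_app is a hypothetical executable):
#   mpirun --mca plm_rsh_num_concurrent 1024 -np 2500 ./my_app

# Show the rsh launcher's parameters, but only if ompi_info is installed.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info --param plm rsh | grep num_concurrent || true
fi
```

Any MCA parameter can be given either as an `OMPI_MCA_<name>` environment variable or with `--mca <name> <value>` on the `mpirun` command line, so the two forms shown should behave the same way.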

Best wishes,
Lydia Heck

> 15. jobs with more than 2,500 processes will not even start
> (Lydia Heck)
>
> ------------------------------
>
> Message: 15
> Date: Tue, 14 Dec 2010 16:10:01 +0000 (GMT)
> From: Lydia Heck <lydia.heck_at_[hidden]>
> Subject: [OMPI users] jobs with more than 2,500 processes will not
> even start
> To: users_at_[hidden]
> Message-ID:
> <alpine.LRH.2.00.1012141549220.20537_at_[hidden]>
> Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII
>
>
> About 9 months ago we installed a new system with 1800 cores, and at the time
> we found that jobs with more than 1028 cores would not start. A colleague then
> found that setting
>
> OMPI_MCA_plm_rsh_num_concurrent=256
>
> helped with the problem.
>
> We have now increased our processor count to more than 2700 cores, and a job
> with 2,500 processes does not start.
>
> Is there any advice?
>
> Best wishes,
>
> Lydia Heck
>