Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Max number of processes per host for an OMPI run?
From: Francesco Simula (francesco.simula_at_[hidden])
Date: 2013-09-12 10:30:10


I confirm that raising the max 'open files' limit to 2048 allows
launching up to 510 processes per node.
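
As a rough back-of-the-envelope check, the two data points in this thread (254 processes at a limit of 1024, 510 at a limit of 2048) happen to fit a simple linear model. The sketch below assumes 4 descriptors per launched process plus 8 reserved for mpirun itself. Both constants are guesses fitted to those two observations, not documented Open MPI values (Jeff recalled 2-3 per process), and `estimate_max_procs` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Rough estimate of how many local processes mpirun can start under a given
# 'open files' limit. Both constants are ASSUMPTIONS fitted to the two
# observations in this thread (1024 -> 254, 2048 -> 510); they are not
# documented Open MPI values and may differ on other versions or systems.
FDS_PER_PROC=4   # assumed descriptors mpirun holds per launched process
FDS_RESERVED=8   # assumed descriptors mpirun needs for itself

estimate_max_procs() {
    # $1: the 'open files' limit (as reported by ulimit -n)
    echo $(( ($1 - FDS_RESERVED) / FDS_PER_PROC ))
}

estimate_max_procs 1024
estimate_max_procs 2048
```

Under these assumptions the two calls print 254 and 510, matching what was observed on q012. Other limits (for example 'max user processes') may be hit first on a different machine.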

By the way, I just discovered that launching the processes while
logged in directly on the host, rather than on the front-end machine,
gives a clearer error message that would probably have tipped me off:

[cut]
[fsimula@q012 ~]$ mpirun -np 255 -host q012 uptime | wc -l
[q012.qng:31455] [[22942,0],0] ORTE_ERROR_LOG: The system limit on
number of pipes a process can open was reached in file
base/odls_base_default_fns.c at line 1739
--------------------------------------------------------------------------
mpirun was unable to start the specified application as it encountered
an error
on node q012.qng. More information may be available above.
--------------------------------------------------------------------------

[fsimula@q012 ~]$ ulimit -n 2048
[fsimula@q012 ~]$ mpirun -np 510 -host q012 uptime | wc -l
510
[/cut]
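
The manual steps above can be wrapped in a small guard that raises the soft limit only when needed. This is a minimal sketch: `raise_nofile` is a hypothetical helper name, and the value 2048 is simply the one used in this thread. A soft limit can only be raised up to the hard limit without administrator help, and the change applies to the current shell and its children only.

```shell
#!/usr/bin/env bash
# Raise the soft 'open files' limit of the current shell if it is too low.
# raise_nofile is a hypothetical helper; 2048 is just the value from this thread.
raise_nofile() {
    local wanted=$1
    local soft hard
    soft=$(ulimit -S -n)
    hard=$(ulimit -H -n)

    # Nothing to do if the soft limit is already high enough.
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$wanted" ]; then
        return 0
    fi

    # An unprivileged user cannot go above the hard limit.
    if [ "$hard" != "unlimited" ] && [ "$hard" -lt "$wanted" ]; then
        echo "hard limit $hard is below $wanted; an admin must raise it" >&2
        return 1
    fi

    ulimit -S -n "$wanted"
}

if raise_nofile 2048; then
    echo "open files (soft): $(ulimit -S -n)"
fi
```

After a successful call, the same shell can run the `mpirun -np 510 -host q012 uptime` command shown above.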

Many thanks to both Jeff and Ralph for pointing me in the right
direction.
Francesco

On 2013-09-11 09:46, Jeff Squyres (jsquyres) wrote:
> As Ralph said, you're probably running out of file descriptors;
> mpirun uses a few (2-3? I don't remember offhand) for each MPI
> process launched.
>
> There are many factors that can cause limits like this -- file
> descriptors are only one. It very much depends on the configuration
> of the machine on which you're running. My point: Sorry, but it'll
> likely take some experimentation on your part to figure out how many
> you can run on a single machine.
>
>
> On Sep 10, 2013, at 4:10 PM, Francesco Simula
> <francesco.simula_at_[hidden]> wrote:
>
>> Dear forum,
>>
>> I should probably apologize in advance for the very basic question,
>> but I wasn't able to find an answer elsewhere:
>> how do I find the maximum number of processes that can be
>> concurrently instantiated by mpirun on one single host of a cluster?
>>
>> If I launch (on a CentOS 6.3 cluster with dual quad-core Xeon
>> nodes, equipped with Open MPI 1.5.4 and IB HCAs, though I think the
>> latter is of no consequence):
>>
>> [cut]
>> mpirun -np 250 -host q012 hostname
>> [/cut]
>>
>> I expect and obtain 250 rows of:
>> [cut]
>> q012.qng
>> [/cut]
>>
>> The same holds for 251, 252, 253 and 254, but not for 255, which returns:
>>
>> [cut]
>>
>> --------------------------------------------------------------------------
>> mpirun was unable to start the specified application as it
>> encountered an error
>> on node q012. More information may be available above.
>>
>> --------------------------------------------------------------------------
>> [/cut]
>>
>> I know that 250 processes is quite an oversubscription for a single
>> node that has no more than 8 real cores, but I wanted to see the actual
>> performance degradation instead of a crash.
>>
>> Which hard limit (in Open MPI or in the system) am I hitting that
>> prevents me from running 255 MPI processes on one single host?
>>
>> The output of ulimit -a for the user is:
>>
>> [cut]
>> ulimit -a
>> core file size          (blocks, -c) 1000000
>> data seg size           (kbytes, -d) unlimited
>> scheduling priority             (-e) 0
>> file size               (blocks, -f) unlimited
>> pending signals                 (-i) 95054
>> max locked memory       (kbytes, -l) unlimited
>> max memory size         (kbytes, -m) unlimited
>> open files                      (-n) 1024
>> pipe size            (512 bytes, -p) 8
>> POSIX message queues     (bytes, -q) 819200
>> real-time priority              (-r) 0
>> stack size              (kbytes, -s) 100000
>> cpu time               (seconds, -t) unlimited
>> max user processes              (-u) 1024
>> virtual memory          (kbytes, -v) unlimited
>> file locks                      (-x) unlimited
>> [/cut]
>>
>> Many thanks,
>> Francesco