
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] memory per core/process
From: Duke Nguyen (duke.lists_at_[hidden])
Date: 2013-04-01 02:30:23


On 3/31/13 12:20 AM, Duke Nguyen wrote:
> I should really have asked earlier. Thanks for all the help.

I think I got excited too soon :). Increasing the stack size does help
when I run a job on a dedicated server. Today I tried to modify the
cluster settings (/etc/security/limits.conf, /etc/init.d/pbs_mom) and
ran a different job on 4 nodes with 8 cores each (nodes=4:ppn=8), but I
still get the mpirun error. My ulimit now reads:

$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 8271027
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 32768
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) 8192
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
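
For reference, these are the kinds of entries I mean in
/etc/security/limits.conf (the values here are illustrative, not the
exact lines from our cluster):

* soft stack unlimited
* hard stack unlimited
* soft memlock unlimited
* hard memlock unlimited
* soft nofile 32768
* hard nofile 32768

plus a ulimit -s unlimited in /etc/init.d/pbs_mom before the daemon
starts.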

Any other advice???

>
> On 3/30/13 10:28 PM, Ralph Castain wrote:
>> FWIW: there is an MCA param that helps with such problems:
>>
>> opal_set_max_sys_limits
>> "Set to non-zero to automatically set any
>> system-imposed limits to the maximum allowed"
>>
>> At the moment, it only sets the limits on the number of open files
>> and the maximum size of a file we can create. It would be easy
>> enough to add the stack size, though as someone pointed out, that
>> has some negatives as well.
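>>
>> For example (a sketch, assuming the usual mpirun MCA command-line
>> syntax; the executable name is made up):
>>
>> mpirun -mca opal_set_max_sys_limits 1 -np 32 ./my_app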
>>
>>
>> On Mar 30, 2013, at 7:35 AM, Gustavo Correa <gus_at_[hidden]>
>> wrote:
>>
>>> On Mar 30, 2013, at 10:02 AM, Duke Nguyen wrote:
>>>
>>>> On 3/30/13 8:20 PM, Reuti wrote:
>>>>> On 30.03.2013 at 13:26, Tim Prince wrote:
>>>>>
>>>>>> On 03/30/2013 06:36 AM, Duke Nguyen wrote:
>>>>>>> On 3/30/13 5:22 PM, Duke Nguyen wrote:
>>>>>>>> On 3/30/13 3:13 PM, Patrick Bégou wrote:
>>>>>>>>> I do not know about your code, but:
>>>>>>>>>
>>>>>>>>> 1) did you check stack limitations? Typically, Intel Fortran
>>>>>>>>> codes need a large amount of stack when the problem size
>>>>>>>>> increases. Check ulimit -a.
>>>>>>>> This is the first time I've heard of stack limitations.
>>>>>>>> Anyway, ulimit -a gives
>>>>>>>>
>>>>>>>> $ ulimit -a
>>>>>>>> core file size (blocks, -c) 0
>>>>>>>> data seg size (kbytes, -d) unlimited
>>>>>>>> scheduling priority (-e) 0
>>>>>>>> file size (blocks, -f) unlimited
>>>>>>>> pending signals (-i) 127368
>>>>>>>> max locked memory (kbytes, -l) unlimited
>>>>>>>> max memory size (kbytes, -m) unlimited
>>>>>>>> open files (-n) 1024
>>>>>>>> pipe size (512 bytes, -p) 8
>>>>>>>> POSIX message queues (bytes, -q) 819200
>>>>>>>> real-time priority (-r) 0
>>>>>>>> stack size (kbytes, -s) 10240
>>>>>>>> cpu time (seconds, -t) unlimited
>>>>>>>> max user processes (-u) 1024
>>>>>>>> virtual memory (kbytes, -v) unlimited
>>>>>>>> file locks (-x) unlimited
>>>>>>>>
>>>>>>>> So the stack size is 10 MB??? Does this create a problem? How
>>>>>>>> do I change it?
>>>>>>> I ran $ ulimit -s unlimited to make the stack size unlimited,
>>>>>>> and the job ran fine!!! So it looks like the stack limit was
>>>>>>> the problem. My questions are:
>>>>>>>
>>>>>>> * how do I set this automatically (and permanently)?
>>>>>>> * should I set all the other ulimits to unlimited?
>>>>>>>
>>>>>> In our environment, the only solution we found is to have
>>>>>> mpirun run a script on each node that sets ulimit (as well as
>>>>>> environment variables, which are more convenient to set there
>>>>>> than on the mpirun command line) before starting the executable;
>>>>>> a sketch of such a wrapper follows below. We had expert
>>>>>> recommendations against this, but no other working solution. It
>>>>>> seems unlikely that you would want to remove any limits that
>>>>>> work at their defaults.
>>>>>> An unlimited stack size is not truly unlimited in reality; it
>>>>>> may still be capped by a system limit or by the implementation.
>>>>>> Since we run up to 120 threads per rank and many applications
>>>>>> have threadprivate data regions, the ability to run without
>>>>>> considering the stack limit is the exception rather than the
>>>>>> rule.
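>>>>>>
>>>>>> (A minimal sketch of such a wrapper; the file names and the
>>>>>> OMP_STACKSIZE value are illustrative, not our actual script:)
>>>>>>
>>>>>> $ cat ulimit_wrapper.sh
>>>>>> #!/bin/sh
>>>>>> # Raise the stack limit and set per-thread stacks, then replace
>>>>>> # this shell with the real application.
>>>>>> ulimit -s unlimited
>>>>>> export OMP_STACKSIZE=512M
>>>>>> exec "$@"
>>>>>>
>>>>>> $ mpirun -np 32 ./ulimit_wrapper.sh ./my_app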
>>>>> Even if I were the only user on a cluster of machines, I would
>>>>> define this in the queuing system to set the limits for the job.
>>>> Sorry if I don't get this correctly, but do you mean I should set
>>>> this through Torque/Maui (our queuing manager) instead of on the
>>>> system itself (/etc/security/limits.conf and /etc/profile.d/)?
>>> Hi Duke
>>>
>>> We do both.
>>> We set memlock and stacksize to unlimited, and increase the maximum
>>> number of open files, in the pbs_mom script in /etc/init.d, and do
>>> the same in /etc/security/limits.conf.
>>> This may be an overzealous "belt and suspenders" policy, but it
>>> works. For instance:
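>>>
>>> (A sketch of the kind of lines we mean near the top of
>>> /etc/init.d/pbs_mom, before the daemon is started; the exact
>>> values are illustrative:)
>>>
>>> # Raise limits so pbs_mom, and the jobs it spawns, inherit them.
>>> ulimit -l unlimited    # max locked memory, needed by OFED/InfiniBand
>>> ulimit -s unlimited    # stack size
>>> ulimit -n 32768        # max open file handles
>>>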
>>> As everybody else said, a small stack size is a common cause of
>>> segmentation faults in large codes.
>>> Basically all the codes that we run here have this problem, with
>>> too many automatic arrays, structures, etc. in functions and
>>> subroutines.
>>> But a small memlock is also trouble for OFED/InfiniBand, and the
>>> small (default) maximum number of open file handles can easily hit
>>> the limit if many programs (or poorly written programs) are running
>>> on the same node.
>>> The default Linux distribution limits don't seem to be tailored for
>>> HPC, I guess.
>>>
>>> I hope this helps,
>>> Gus Correa
>>>