
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] memory per core/process
From: Duke Nguyen (duke.lists_at_[hidden])
Date: 2013-04-02 10:14:55


On 4/2/13 6:50 PM, Reuti wrote:
> Hi,
>
> On 30.03.2013 at 14:46, Patrick Bégou wrote:
>
>> Ok, so your problem is identified as a stack size problem. I ran into these limitations using Intel Fortran compilers on large data problems.
>>
>> First, it seems you can increase your stack size, as "ulimit -s unlimited" works (no system hard limit is enforced). The best way is to set this in your .bashrc file so it will work on every node.
>> But setting it to unlimited may not be really safe. For example, if you run a badly coded recursive function that calls itself without a stop condition, it can request all the system memory and crash the node. So set a large but limited value; it's safer.
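>>
>> For instance, a minimal sketch for ~/.bashrc (512MB here is only an
>> illustrative value, pick one that fits your codes):
>>
>> # raise the soft stack limit to 512MB (ulimit -s takes kbytes)
>> ulimit -s 524288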
>>
>> I'm managing a cluster and I always set a maximum value for the stack size. I also limit the memory available per core, for system stability. If a user requests only one of the 12 cores of a node, he can only access 1/12 of the node's memory. If he needs more memory, he has to request 2 cores, even if he runs a sequential code. This avoids crashing the jobs of other users on the same node with excessive memory requirements. But this is not configured on your node.
> This is one way to implement memory limits as a policy - the user then has to request the correct number of cores even though he only wants to run a serial job. Personally, I prefer that the user specifies the requested memory in such a case. It is then up to the queuing system to avoid scheduling additional jobs on a machine unless the remaining memory is sufficient for their execution.

We use Torque/Maui and I want to do something similar (still learning -
Torque/Maui, together with Open MPI, are new to me). Unfortunately,
posting to the Torque/Maui forums is somehow too difficult (my posts
were moderated since I am a newcomer, but it seems nobody is managing
those forums, so my posts never got through...). I wish they were as
active as this forum...
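
If I understand Reuti's suggestion correctly, with Torque this would be
a resource request at submission time, something like this sketch
(assuming Torque's -l resource syntax; pmem is the per-process memory
limit, and job.sh is just a placeholder):

$ # ask for 4 cores with at most 2GB per process
$ qsub -l nodes=1:ppn=4,pmem=2gb job.sh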

D.

>
> -- Reuti
>
>
>> Duke Nguyen wrote:
>>> On 3/30/13 3:13 PM, Patrick Bégou wrote:
>>>> I do not know about your code but:
>>>>
>>>> 1) did you check stack limitations? Typically, Intel Fortran codes need a large amount of stack when the problem size increases.
>>>> Check ulimit -a
>>> This is the first time I have heard of stack limitations. Anyway, ulimit -a gives:
>>>
>>> $ ulimit -a
>>> core file size          (blocks, -c) 0
>>> data seg size           (kbytes, -d) unlimited
>>> scheduling priority             (-e) 0
>>> file size               (blocks, -f) unlimited
>>> pending signals                 (-i) 127368
>>> max locked memory       (kbytes, -l) unlimited
>>> max memory size         (kbytes, -m) unlimited
>>> open files                      (-n) 1024
>>> pipe size            (512 bytes, -p) 8
>>> POSIX message queues     (bytes, -q) 819200
>>> real-time priority              (-r) 0
>>> stack size              (kbytes, -s) 10240
>>> cpu time               (seconds, -t) unlimited
>>> max user processes              (-u) 1024
>>> virtual memory          (kbytes, -v) unlimited
>>> file locks                      (-x) unlimited
>>>
>>> So the stack size is 10MB??? Could this be causing the problem? How do I change it?
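>>>
>>> (A sketch of what I could try in the shell, if I understand ulimit right:)
>>>
>>> $ ulimit -Hs            # show the hard stack limit; the soft one can be raised up to this
>>> $ ulimit -s unlimited   # raise the soft limit, for this shell session only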
>>>
>>>> 2) does your node use cpusets and memory limitation (like fake NUMA) to set the maximum amount of memory available for a job?
>>> I don't really understand (this is also the first time I've heard of fake NUMA), but I am pretty sure we do not have such things. The server I tried was a dedicated server with 2 x X5420 CPUs and 16GB of physical memory.
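>>>
>>> (A sketch, assuming the numactl package is installed, of how the NUMA
>>> layout could be checked:)
>>>
>>> $ numactl --hardware   # lists the NUMA nodes and the memory attached to each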
>>>
>>>> Patrick
>>>>
>>>> Duke Nguyen wrote:
>>>>> Hi folks,
>>>>>
>>>>> I am sorry if this question has been asked before, but after ten days of searching and working on the system, I surrender :(. We are trying to use mpirun to run abinit (abinit.org), which in turn reads an input file to run a simulation. The command is pretty simple:
>>>>>
>>>>> $ mpirun -np 4 /opt/apps/abinit/bin/abinit < input.files >& output.log
>>>>>
>>>>> We ran this command on a server with two quad-core X5420s and 16GB of memory. I used only 4 cores, and I guess in theory each core should be able to take up to 2GB.
>>>>>
>>>>> In the output of the log, there is something about memory:
>>>>>
>>>>> P This job should need less than 717.175 Mbytes of memory.
>>>>> Rough estimation (10% accuracy) of disk space for files :
>>>>> WF disk file : 69.524 Mbytes ; DEN or POT disk file : 14.240 Mbytes.
>>>>>
>>>>> So basically it reported that the above job should need no more than about 718MB of memory per core.
>>>>>
>>>>> But I still have the Segmentation Fault error:
>>>>>
>>>>> mpirun noticed that process rank 0 with PID 16099 on node biobos exited on signal 11 (Segmentation fault).
>>>>>
>>>>> The system limits are already set to unlimited:
>>>>>
>>>>> $ cat /etc/security/limits.conf | grep -v '#'
>>>>> * soft memlock unlimited
>>>>> * hard memlock unlimited
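>>>>>
>>>>> (Note: memlock only covers locked memory, not the stack; a sketch of
>>>>> the corresponding stack entries for /etc/security/limits.conf, with
>>>>> an illustrative 512MB soft limit, would be:)
>>>>>
>>>>> * soft stack 524288
>>>>> * hard stack unlimited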
>>>>>
>>>>> I also tried to run
>>>>>
>>>>> $ ulimit -l unlimited
>>>>>
>>>>> before the mpirun command above, but it did not help at all.
>>>>>
>>>>> If we adjust the parameters in input.files so that the reported memory per core is less than 512MB, then the job runs fine.
>>>>>
>>>>> Please help,
>>>>>
>>>>> Thanks,
>>>>>
>>>>> D.
>>>>>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>