On 3/30/13 3:13 PM, Patrick Bégou wrote:
> I do not know about your code, but:
> 1) Did you check stack limitations? Typically, Intel Fortran codes
> need a large amount of stack as the problem size increases.
> Check ulimit -a
This is the first time I have heard of stack limitations. Anyway, ulimit -a gives:
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 127368
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
So the stack size is only 10MB? Could that be causing the problem? How do I change it?
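For reference, the usual way to raise it for the current shell (assuming the hard limit permits it) is:

$ ulimit -s unlimited

and to make the change permanent, the stack item can be set in /etc/security/limits.conf (the same file quoted further below), e.g.:

*    soft    stack    unlimited
*    hard    stack    unlimited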
> 2) Does your node use cpusets and memory limitations (such as fake
> NUMA) to set the maximum amount of memory available to a job?
I don't really understand (this is also the first time I have heard of
fake NUMA), but I am pretty sure we do not have such things. The server
I tried was a dedicated server with two X5420 CPUs and 16GB of physical
memory.
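For anyone wanting to verify this, two quick checks (a sketch, assuming
numactl is installed; fake NUMA is normally enabled via the numa=fake
kernel boot parameter):

$ numactl --hardware                       # lists the NUMA nodes the kernel sees
$ grep -o 'numa=fake[^ ]*' /proc/cmdline   # prints nothing if fake NUMA is off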
> Duke Nguyen wrote:
>> Hi folks,
>> I am sorry if this question has been asked before, but after ten days
>> of searching/working on the system, I surrender :(. We are trying to
>> use mpirun to run abinit (abinit.org), which in turn reads an input
>> file to run a simulation. The command is pretty simple:
>> $ mpirun -np 4 /opt/apps/abinit/bin/abinit < input.files >& output.log
>> We ran this command on a server with two quad-core X5420 CPUs and
>> 16GB of memory. I used only 4 cores, so I guess in theory each core
>> should be able to take up to 2GB.
>> In the log output, there is something about memory:
>> P This job should need less than 717.175 Mbytes of memory.
>> Rough estimation (10% accuracy) of disk space for files :
>> WF disk file : 69.524 Mbytes ; DEN or POT disk file : 14.240 Mbytes
>> So basically it reported that the above job should need no more than
>> about 718MB per core.
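>> (For what it's worth, one way to compare that estimate with what the
>> processes actually use while the job is running, using standard tools:)
>> $ ps -C abinit -o pid,rss,vsz,comm   # per-process resident/virtual memory in kB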
>> But I still have the Segmentation Fault error:
>> mpirun noticed that process rank 0 with PID 16099 on node biobos
>> exited on signal 11 (Segmentation fault).
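>> (When a rank dies with signal 11 like this, a standard way to find the
>> crash point — a sketch, assuming core dumps are enabled; note the
>> "core file size 0" in the ulimit output above — is:)
>> $ ulimit -c unlimited      # allow a core file to be written
>> $ mpirun -np 4 /opt/apps/abinit/bin/abinit < input.files >& output.log
>> $ gdb /opt/apps/abinit/bin/abinit core   # exact core file name depends on kernel.core_pattern
>> (gdb) bt                   # print the backtrace at the crash point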
>> The system limits are already set to unlimited:
>> $ cat /etc/security/limits.conf | grep -v '#'
>> * soft memlock unlimited
>> * hard memlock unlimited
>> I also tried running
>> $ ulimit -l unlimited
>> before the mpirun command above, but it did not help at all.
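>> (Note that ulimit -l sets the locked-memory limit, not the stack limit
>> discussed above; the latter is ulimit -s. A minimal sketch of a way to
>> guarantee the limit applies to every abinit process — the wrapper name
>> run_abinit.sh is just an example:)
>> $ cat run_abinit.sh
>> #!/bin/sh
>> # Raise the stack limit for this process, then replace it with abinit.
>> ulimit -s unlimited
>> exec /opt/apps/abinit/bin/abinit "$@"
>> $ chmod +x run_abinit.sh
>> $ mpirun -np 4 ./run_abinit.sh < input.files >& output.log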
>> If we adjust the parameters in input.files so that the reported
>> memory per core is less than 512MB, the job runs fine.
>> Please help,