I run 6 parallel processes under Open MPI.
When the program's run time is short, it works well.
But when the run time is long, I get the following errors:
[n124:45521] *** Process received signal ***
[n124:45521] Signal: Segmentation fault (11)
[n124:45521] Signal code: Address not mapped (1)
[n124:45521] Failing at address: 0x44
[n124:45521] [ 0] /lib64/libpthread.so.0 [0x3c50e0e4c0]
[n124:45521] [ 1] /lib64/libc.so.6(strlen+0x10) [0x3c50278d60]
[n124:45521] [ 2] /lib64/libc.so.6(_IO_vfprintf+0x4479) [0x3c50246b19]
[n124:45521] [ 3] /lib64/libc.so.6(_IO_printf+0x9a) [0x3c5024d3aa]
[n124:45521] [ 4] /home/path/exec [0x40ec9a]
[n124:45521] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3c5021d974]
[n124:45521] [ 6] /home/path/exec [0x401139]
[n124:45521] *** End of error message ***
It seems there may be a memory-management problem,
but I cannot find the cause.
My program needs to write results to some files.
If I open too many files without closing them, I may get errors like the above.
However, I have removed all file writing from my program,
and the problem still appears when the program runs for a longer time.
Any help is appreciated.
July 25 2010