Have you tried running your application under valgrind?
Even though applications generally run slower under valgrind,
it may detect memory errors before the actual crash happens.
It is best to start a terminal window for each of your processes
so that you can see valgrind's output for each process separately.
On Mon, Jul 26, 2010 at 4:08 AM, Jack Bryan <dtustudy68_at_[hidden]> wrote:
> Dear All,
> I run 6 parallel processes on OpenMPI.
> When the run-time of the program is short, it works well.
> But if the run-time is long, I get errors:
> [n124:45521] *** Process received signal ***
> [n124:45521] Signal: Segmentation fault (11)
> [n124:45521] Signal code: Address not mapped (1)
> [n124:45521] Failing at address: 0x44
> [n124:45521] [ 0] /lib64/libpthread.so.0 [0x3c50e0e4c0]
> [n124:45521] [ 1] /lib64/libc.so.6(strlen+0x10) [0x3c50278d60]
> [n124:45521] [ 2] /lib64/libc.so.6(_IO_vfprintf+0x4479) [0x3c50246b19]
> [n124:45521] [ 3] /lib64/libc.so.6(_IO_printf+0x9a) [0x3c5024d3aa]
> [n124:45521] [ 4] /home/path/exec [0x40ec9a]
> [n124:45521] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3c5021d974]
> [n124:45521] [ 6] /home/path/exec [0x401139]
> [n124:45521] *** End of error message ***
> It seems that there may be some problem with memory management.
> But, I cannot find the reason.
> My program needs to write results to some files.
> If I open too many files without closing them, I may get the above errors.
> But, I have removed the file writing from my program.
> The problem appears again when the program runs for a longer time.
> Any help is appreciated.
> July 25 2010