On Feb 25, 2009, at 12:25 PM, Ken Mighell wrote:
> We are trying to compile the code with Open MPI on a Mac Pro with 2
> quad-core Xeons using gfortran.
> The code seems to be running ... for the most part. Unfortunately we
> keep getting a segfault
> which spits out a variant of the following message:
> [oblix:21522] *** Process received signal ***
> [oblix:21522] Signal: Segmentation fault (11)
> [oblix:21522] Signal code: Address not mapped (1)
> [oblix:21522] Failing at address: 0xc0000710
> [oblix:21522] [ 0] 2 libSystem.B.dylib 0x92a892bb _sigtramp + 43
> [oblix:21522] [ 1] 3 ??? 0xffffffff 0x0 + 4294967295
> [oblix:21522] [ 2] 4 exe.out 0x0001281b MAIN__ + 4875
> [oblix:21522] [ 3] 5 exe.out 0x00013c38 main + 40
> [oblix:21522] [ 4] 6 exe.out 0x00001936 start + 54
> [oblix:21522] *** End of error message ***
> After researching the error message and digging around in
> the Open MPI user's mailing list,
> it appears that the bug may be in Open MPI.
I'm not sure what you mean by this -- getting a stack trace out of
Open MPI doesn't necessarily mean a bug in Open MPI.
Can you get a corefile and look to see exactly what failed? Or run
under a debugger to see where/how exactly the process fails? From the
stack trace above, it looks like the failure occurs in application
code, not Open MPI...?
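
To make that reading of the trace concrete, here is a rough Python sketch
that splits each frame into image, symbol, and offset, then picks the
first frame that is neither the signal trampoline nor unresolved. It is
only an illustration, not an Open MPI tool: the regular expression and
the names parse_frames and first_resolved_frame are my own assumptions,
and the frames are copied from the trace quoted above with the wrapped
lines joined.

```python
import re

# Frames copied from the quoted trace (wrapped lines joined by hand).
TRACE = """\
[oblix:21522] [ 0] 2 libSystem.B.dylib 0x92a892bb _sigtramp + 43
[oblix:21522] [ 1] 3 ??? 0xffffffff 0x0 + 4294967295
[oblix:21522] [ 2] 4 exe.out 0x0001281b MAIN__ + 4875
[oblix:21522] [ 3] 5 exe.out 0x00013c38 main + 40
[oblix:21522] [ 4] 6 exe.out 0x00001936 start + 54
"""

# Assumed layout: "[ N] depth image address symbol + offset".
FRAME_RE = re.compile(
    r"\[\s*(?P<index>\d+)\]\s+\d+\s+(?P<image>\S+)\s+"
    r"(?P<addr>0x[0-9a-fA-F]+)\s+(?P<symbol>\S+)\s+\+\s+(?P<offset>\d+)"
)


def parse_frames(trace):
    """Return a list of (index, image, symbol, offset) tuples."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((
                int(match.group("index")),
                match.group("image"),
                match.group("symbol"),
                int(match.group("offset")),
            ))
    return frames


def first_resolved_frame(frames):
    """Skip the signal trampoline and unresolved frames; return the next one."""
    for frame in frames:
        _, image, symbol, _ = frame
        if symbol == "_sigtramp" or image == "???":
            continue
        return frame
    return None


if __name__ == "__main__":
    frames = parse_frames(TRACE)
    print(first_resolved_frame(frames))
    print(sorted({image for _, image, _, _ in frames}))
```

The first resolved frame is MAIN__ in exe.out, which is the symbol
gfortran emits for the Fortran main program, and no Open MPI library
shows up among the images in the trace. That only says where the process
was when the signal arrived, so a corefile or debugger session is still
needed to see which statement actually failed.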