Open MPI User's Mailing List Archives

From: Brian Barrett (brbarret_at_[hidden])
Date: 2006-03-23 21:48:56


On Mar 22, 2006, at 1:47 PM, Michael Kluskens wrote:

> Trying to find the cause of one or more errors that might involve
> libopal.so
>
> Built openmpi-1.1a1r9351 on Debian Linux on Opteron with PGI 6.1-3
> using "./configure --with-gnu-ld F77=pgf77 FFLAGS=-fastsse FC=pgf90
> FCFLAGS=-fastsse"
>
> My program generates the following error which I do not understand:
>
> Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
> Failing at addr:0x4
> [0] func:/usr/local/lib/libopal.so.0 [0x2a959927dd]
> *** End of error message ***
>
> Is it possible I'm overrunning the Open MPI buffers? My test program
> works fine other than the "GPR data corruption" errors (it uses
> MPI_SPAWN; posted previously). The basic MPI difference between my
> test program and the real program is the massive amount of data being
> distributed via BCAST and SEND/RECV.

It worries me that the call stack only goes that deep - there should
be more functions listed there (if nothing else, the main()
function). Can you run your application in a debugger and try to get
a full stack trace? Typically, segmentation faults point to
overwriting user buffers, but without more detail, it's hard to
pinpoint the issue.
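
One way to test the buffer-overwrite theory while waiting on a full stack trace is to put a guard zone after each receive buffer and check it after every communication call. The following is only a minimal sketch in C, not anything from Open MPI itself: guarded_buf, guarded_alloc, guarded_intact, guarded_free, and fake_receive are hypothetical names, and fake_receive stands in for a real receive or broadcast so the idea can be tried without MPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTES 16      /* room for two stray doubles past the end */
#define GUARD_FILL  0xA5    /* arbitrary, easy-to-spot byte pattern */

/* Hypothetical helper type: a buffer of doubles plus a trailing guard zone. */
typedef struct {
    double        *data;   /* buffer handed to the communication call */
    size_t         count;  /* number of doubles the caller asked for */
    unsigned char *guard;  /* guard zone placed right after data */
} guarded_buf;

/* Allocate count doubles followed by a patterned guard zone.
 * On allocation failure, the returned data pointer is NULL. */
static guarded_buf guarded_alloc(size_t count)
{
    guarded_buf b = { NULL, 0, NULL };
    unsigned char *raw = malloc(count * sizeof(double) + GUARD_BYTES);
    if (raw == NULL)
        return b;
    b.data  = (double *)raw;
    b.count = count;
    b.guard = raw + count * sizeof(double);
    memset(b.guard, GUARD_FILL, GUARD_BYTES);
    return b;
}

/* Return 1 if the guard zone is untouched, 0 if something wrote past data. */
static int guarded_intact(const guarded_buf *b)
{
    assert(b != NULL && b->guard != NULL);
    for (size_t i = 0; i < GUARD_BYTES; i++)
        if (b->guard[i] != GUARD_FILL)
            return 0;
    return 1;
}

/* Release the buffer and clear the bookkeeping fields. */
static void guarded_free(guarded_buf *b)
{
    free(b->data);
    b->data  = NULL;
    b->guard = NULL;
    b->count = 0;
}

/* Stand-in for a receive (assumption: not a real MPI call).
 * Writes n doubles into buf, whether or not buf is that large. */
static void fake_receive(double *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (double)i;
}
```

In a real program, the call to fake_receive would be replaced by the actual receive or broadcast, with guarded_intact checked right after it returns. A failed check narrows the problem to a count or buffer-size mismatch in that call. The guard only catches writes that land within GUARD_BYTES of the end of the buffer, so a clean result does not rule out a larger overrun.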

Thanks,

Brian

-- 
   Brian Barrett
   Open MPI developer
   http://www.open-mpi.org/