> To: users@open-mpi.org
> Subject: Re: [OMPI users] Open MPI, Segmentation fault
>
> Hello Jack, list
>
> As others mentioned, this may be a problem with dynamic
> memory allocation.
> It could also be a violation of statically allocated memory,
> I guess.
>
> You say:
>
> > My program runs well for 1, 2, 10 processors, but fails when the
> > number of tasks cannot
> > be divided evenly by the number of processes.
>
> Often, when the division of the number of "tasks"
> (or the global problem size) by the number of "processors" is not even,
> one processor gets a lighter/heavier workload than the others,
> allocates less/more memory than the others,
> and accesses smaller/larger arrays than the others.
>
> In general, integer division and remainder/modulo calculations
> are used to control memory allocation, array sizes, etc.,
> on different processors.
> These formulas tend to use the MPI communicator size
> (i.e., effectively the number of processors if you are using
> MPI_COMM_WORLD) to split the workload across the processors.
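
A minimal sketch of the kind of integer division / remainder formula described above, assuming a simple contiguous block split of n items. The names block_range and block_decompose are hypothetical and are not taken from Jack's program. In a real MPI code, rank and size would typically come from MPI_Comm_rank and MPI_Comm_size on the communicator in use; here they are plain parameters so the arithmetic can be checked on its own.

```c
#include <assert.h>

/* Half-open index range [start, end) owned by one rank. */
typedef struct {
    long start;
    long end;
} block_range;

/* Split n items over `size` ranks. The first (n % size) ranks each get
 * one extra item, so local counts differ by at most one. */
block_range block_decompose(long n, int rank, int size)
{
    assert(n >= 0 && size > 0 && rank >= 0 && rank < size);

    long base  = n / size;   /* items every rank receives */
    long extra = n % size;   /* leftover items, one each to the low ranks */

    block_range r;
    r.start = rank * base + (rank < extra ? rank : extra);
    r.end   = r.start + base + (rank < extra ? 1 : 0);
    return r;
}
```

With n = 10 and size = 3, this gives the ranges [0, 4), [4, 7) and [7, 10). Each rank would then allocate (end - start) elements, not n / size, and loop only over its own range.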
>
> I would search for the lines of code where those calculations are done,
> and where the arrays are allocated and accessed,
> to make sure the algorithm works both when
> they are of the same size
> (even workload across the processors)
> and when they are of different sizes
> (uneven workload across the processors).
> You may be violating memory access by only a few bytes, due to a small
> mistake in one of those integer division / remainder/modulo formulas,
> perhaps where an array index upper or lower bound is calculated.
> It happened to me before, probably to others too.
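
To illustrate how an upper bound can go wrong only in the uneven case, here is a hedged sketch of one common mistake. It is an assumed example, not a diagnosis of Jack's code, and the function names naive_end, clamped_end and first_overrunning_rank are hypothetical. Every rank assumes a full chunk of ceil(n / size) items, which is harmless when n divides evenly but pushes the last rank past the end of the data otherwise.

```c
#include <assert.h>

/* Upper bound as it is sometimes written: every rank assumes it owns a
 * full chunk of ceil(n / size) items. */
long naive_end(long n, int rank, int size)
{
    assert(n >= 0 && size > 0 && rank >= 0 && rank < size);
    long chunk = (n + size - 1) / size;   /* ceil(n / size) */
    return (rank + 1) * chunk;            /* may exceed n */
}

/* Same formula, clamped to the global problem size. */
long clamped_end(long n, int rank, int size)
{
    long end = naive_end(n, rank, size);
    return end < n ? end : n;
}

/* Return the first rank whose naive upper bound runs past n,
 * or -1 if no rank overruns. */
int first_overrunning_rank(long n, int size)
{
    for (int rank = 0; rank < size; ++rank) {
        if (naive_end(n, rank, size) > n)
            return rank;
    }
    return -1;
}
```

For n = 10, no rank overruns with size = 5, but with size = 4 the last rank's naive bound is 12, two elements past the end. A small loop like first_overrunning_rank, run over the process counts you intend to use, is one way to do this inspection before reaching for a debugger.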
>
> This type of code inspection can be done without a debugger,
> or before you get to the debugger phase.
>
> I hope this helps,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
> > Jeff Squyres wrote:
> > Also see http://www.open-mpi.org/faq/?category=debugging.
> >
> > On Jul 1, 2010, at 3:17 AM, Asad Ali wrote:
> >
> >> Hi Jack,
> >>
> >> Debugging OpenMPI with traditional debuggers is a pain.
> >> From your error message, it sounds like you have a memory allocation problem. Do you use dynamic memory allocation (allocate and then free)?
> >>
> >> I use printf() calls tagged with the MPI rank. They tell me which process is causing the segmentation fault.
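
A rough sketch of the rank-tagged printing Asad describes, assuming the rank and communicator size have already been obtained (in an MPI program, typically via MPI_Comm_rank and MPI_Comm_size). The helper names format_rank_msg and rank_trace are hypothetical. The message goes to stderr and is flushed immediately, so that it is less likely to be lost in a buffer if the process crashes right afterwards.

```c
#include <assert.h>
#include <stdio.h>

/* Write "[rank R/S] where: msg\n" into buf.
 * Returns what snprintf returns: the length the full line needs. */
int format_rank_msg(char *buf, size_t len, int rank, int size,
                    const char *where, const char *msg)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "[rank %d/%d] %s: %s\n",
                    rank, size, where, msg);
}

/* Print a rank-tagged trace line to stderr and flush it at once. */
void rank_trace(int rank, int size, const char *where, const char *msg)
{
    char line[256];
    format_rank_msg(line, sizeof line, rank, size, where, msg);
    fputs(line, stderr);
    fflush(stderr);
}
```

Calling rank_trace(rank, size, "alloc", "before malloc") just before each suspicious allocation or array loop shows the last checkpoint each process reached, which narrows down where the failing rank died.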
> >>
> >> Cheers,
> >>
> >> Asad
> >>
> >> On Thu, Jul 1, 2010 at 4:13 PM, Jack Bryan <dtustudy68@hotmail.com> wrote:
> >> thanks
> >>
> >> I am not familiar with OpenMPI.
> >>
> >> Could you please tell me how to get Open MPI to show where the fault occurs?
> >>
> >> GNU debugger?
> >>
> >> Any help is appreciated.
> >>
> >> thanks!!!
> >>
> >> Jack
> >>
> >> June 30 2010
> >>
> >> Date: Wed, 30 Jun 2010 16:13:09 -0400
> >> From: amjad11@gmail.com
> >> To: users@open-mpi.org
> >> Subject: Re: [OMPI users] Open MPI, Segmentation fault
> >>
> >>
> >> Based on my experience, I FULLY endorse (100% agree with) David Zhang.
> >> It is usually a coding mistake or a typo.
> >>
> >> First, ensure that array sizes and dimensions are correct.
> >>
> >> In my experience, if Open MPI is compiled with the GNU compilers (not with Intel), it also points out exactly which subroutine the fault occurs in. Give it a try.
> >>
> >> best,
> >> AA
> >>
> >>
> >>
> >> On Wed, Jun 30, 2010 at 12:43 PM, David Zhang <solarbikedz@gmail.com> wrote:
> >> Whenever I got segmentation faults, they were always due to my own coding mistakes. Perhaps your code is not robust against a number of processes that is not divisible by 2?
> >>
> >> On Wed, Jun 30, 2010 at 8:47 AM, Jack Bryan <dtustudy68@hotmail.com> wrote:
> >> Dear All,
> >>
> >> I am using Open MPI and got this error:
> >>
> >> [n337:37664] *** Process received signal ***
> >> [n337:37664] Signal: Segmentation fault (11)
> >> [n337:37664] Signal code: Address not mapped (1)
> >> [n337:37664] Failing at address: 0x7fffcfe90000
> >> [n337:37664] [ 0] /lib64/libpthread.so.0 [0x3c50e0e4c0]
> >> [n337:37664] [ 1] /lustre/home/rhascheduler/RhaScheduler-0.4.1.1/mytest/nmn2 [0x414ed7]
> >> [n337:37664] [ 2] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3c5021d974]
> >> [n337:37664] [ 3] /lustre/home/rhascheduler/RhaScheduler-0.4.1.1/mytest/nmn2(__gxx_personality_v0+0x1f1) [0x412139]
> >> [n337:37664] *** End of error message ***
> >>
> >> After searching for answers, it seems that some function fails.
> >>
> >> My program runs well for 1, 2, 10 processors, but fails when the number of tasks cannot
> >> be divided evenly by the number of processes.
> >>
> >> Any help is appreciated.
> >>
> >> thanks
> >>
> >> Jack
> >>
> >> June 30 2010
> >>
> >>
> >>
> >> _______________________________________________
> >> users mailing list
> >> users@open-mpi.org
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>
> >>
> >>
> >> --
> >> David Zhang
> >> University of California, San Diego
> >>
> >> --
> >> "Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write." - H.G. Wells