Open MPI Development Mailing List Archives

From: Tim Prins (tprins_at_[hidden])
Date: 2007-09-21 10:07:46


Aurelien and Brian,

Thanks for the suggestions. I reran the tests with --without-memory-manager
and got glibc errors on 2 of 5000 runs:
*** glibc detected *** corrupted double-linked list: 0xf704dff8 ***
on one and
*** glibc detected *** malloc(): memory corruption: 0xeda00c70 ***
on the other.

So it looks like we are over-running our allocated space somewhere. I am now
redoing the run under valgrind.
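
To illustrate the kind of over-run suspected here, below is a minimal sketch
in C of a guard-zone check around malloc. The names guarded_malloc,
guarded_check and guarded_free are hypothetical helpers made up for this
example; they are not part of Open MPI, glibc, or valgrind, and this is not
how any of those tools detect corruption internally. The sketch only shows
the general idea: place known bytes after the usable region and verify later
that nothing wrote over them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8      /* number of guard bytes after the user region */
#define GUARD_BYTE 0xA5   /* arbitrary pattern, assumed unlikely in real data */

/* Header stored in front of the user region (hypothetical layout). */
typedef union {
    size_t size;          /* usable bytes requested by the caller */
    max_align_t align;    /* keeps the user pointer suitably aligned */
} guard_header;

/* Allocate `size` usable bytes followed by GUARD_LEN guard bytes. */
void *guarded_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(guard_header) - GUARD_LEN)
        return NULL;                      /* total size would overflow */

    unsigned char *base = malloc(sizeof(guard_header) + size + GUARD_LEN);
    if (base == NULL)
        return NULL;

    ((guard_header *)base)->size = size;
    unsigned char *user = base + sizeof(guard_header);
    memset(user + size, GUARD_BYTE, GUARD_LEN);   /* fill the guard zone */
    return user;
}

/* Return 1 if the guard zone is intact, 0 if something wrote past the end. */
int guarded_check(const void *ptr)
{
    assert(ptr != NULL);
    const unsigned char *user = ptr;
    const guard_header *hdr =
        (const guard_header *)(user - sizeof(guard_header));

    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (user[hdr->size + i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

/* Release a block obtained from guarded_malloc. */
void guarded_free(void *ptr)
{
    if (ptr == NULL)
        return;
    free((unsigned char *)ptr - sizeof(guard_header));
}
```

Calling guarded_check on a buffer before guarded_free would flag a write one
byte past the requested size. A check like this only catches over-runs that
land in the guard zone, which is one reason a tool such as valgrind is the
more thorough option.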

Tim

On Thursday 20 September 2007 09:59:14 pm Brian Barrett wrote:
> On Sep 20, 2007, at 7:02 AM, Tim Prins wrote:
> > In our nightly runs with the trunk I have started seeing cases where
> > we appear to be segfaulting within/below malloc. Below is a typical
> > output.
> >
> > Note that this appears to happen only on the trunk, when we use
> > openib and are in 32-bit mode. It seems to happen randomly at a very
> > low frequency (59 out of about 60,000 32-bit openib runs).
> >
> > This could be a problem with our machine, and it has shown up since I
> > started testing 32-bit OFED 10 days ago.
> >
> > Anyway, just curious if anyone had any ideas.
>
> As someone else said, this usually points to a duplicate free or the
> like in malloc. You might want to try compiling with
> --without-memory-manager, as the ptmalloc2 in glibc frequently is more
> verbose about where errors occurred than is the one in Open MPI.
>
> Brian