Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Valgrind writev() errors with 1.3.2.
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-06-09 14:41:59


We have done similar things in our code base, and I may explore that option
here. Doing it in too many places gets to be icky, though, so I'll have to
look at it before deciding on the course of action.

Thanks!
Ralph

On Tue, Jun 9, 2009 at 12:13 PM, tom fogal <tfogal_at_[hidden]> wrote:

> Ralph Castain <rhc_at_[hidden]> writes:
> > I can't speak to all of the OMPI code, but I can certainly create
> > a new configure option --valgrind-friendly that would initialize
> > the OOB comm buffers and other RTE-related memory to eliminate such
> > warnings.
>
> That would be excellent, thank you for offering.
>
> > I would prefer to configure it out rather than adding a bunch of
> > "if-then" checks for envars to avoid having the performance hit when
> > not needed.
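
A minimal sketch of what a compile-time switch like this could look like, assuming a hypothetical macro VALGRIND_FRIENDLY that the build would define (for example via -DVALGRIND_FRIENDLY=1) and a hypothetical helper alloc_comm_buffer(). Neither name comes from the Open MPI source; they only illustrate how the zeroing can be compiled out entirely so that the default build has no extra branch and no extra memset.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical build-time switch; defaults to off when not set by the build. */
#ifndef VALGRIND_FRIENDLY
#define VALGRIND_FRIENDLY 0
#endif

/* Hypothetical allocator for a communication buffer. In a
 * valgrind-friendly build every byte is defined before use;
 * otherwise the memset is compiled out entirely. */
void *alloc_comm_buffer(size_t size) {
    void *buf;

    assert(size > 0);
    buf = malloc(size);
#if VALGRIND_FRIENDLY
    if (buf != NULL) {
        memset(buf, 0, size);
    }
#endif
    return buf;
}
```

The caller is unchanged in either build and still owns the buffer, releasing it with free().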
>
> FWIW, we've solved this before by using function pointers initialized
> on load, e.g. (warning, untested pseudocode):
>
> /* "do" is a reserved word in C, so the worker is called do_work here. */
> void mymethod(int stuff) {
>   do_work(stuff);
> }
> void mymethod_debug(int stuff) {
>   internal_consistency_check();
>   do_work(stuff);
> }
> void (*method)(int);
> ...
> void init() {
>   method = mymethod;
>   if(getenv("DEBUGGING") != NULL) {
>     method = mymethod_debug;
>   }
> }
>
> void algorithm() {
>   ...
>   method(42);
>   ...
> }
>
> You'd only pay the branch during the one-time init(). Of course, the
> method can't be inlined anymore either.
>
> Anyway, I realize that's quite a bit more work. Preferred, but the
> configure check would suffice for most of my needs.
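
For reference, a compilable sketch of the function-pointer approach quoted above. The original was flagged as untested pseudocode, so several things here are assumptions: the worker is named do_work() (since "do" is a C keyword), the two counters exist only to make the chosen variant observable, and select_method() is a hypothetical helper split out of init() so the choice can be exercised without changing the environment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counters so the effect of each variant is observable. */
int work_calls = 0;
int check_calls = 0;

/* Stand-ins for the real work and the extra debug-only check. */
void do_work(int stuff) { (void)stuff; work_calls++; }
void internal_consistency_check(void) { check_calls++; }

/* Fast variant. */
void mymethod(int stuff) { do_work(stuff); }

/* Debug variant: same work, preceded by the extra check. */
void mymethod_debug(int stuff) {
    internal_consistency_check();
    do_work(stuff);
}

/* Dispatch pointer; defaults to the fast variant. */
void (*method)(int) = mymethod;

/* Pick the variant from the value of the environment variable
 * (NULL means unset). */
void select_method(const char *debug_env) {
    method = (debug_env != NULL) ? mymethod_debug : mymethod;
}

/* One-time setup: the only place the branch is paid. */
void init(void) { select_method(getenv("DEBUGGING")); }

void algorithm(void) {
    assert(method != NULL);
    method(42); /* indirect call; generally not inlined */
}
```

Calling init() once at load time and then algorithm() as often as needed gives the behavior described: running with DEBUGGING set in the environment selects the checked variant, and the hot path itself never tests the variable.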
>
> > Would that help?
>
> Tremendously, thank you.
>
> -tom
>
> > On Tue, Jun 9, 2009 at 11:40 AM, tom fogal <tfogal_at_[hidden]> wrote:
> >
> > > jody <jody.xha_at_[hidden]> writes:
> > > > I made a suppression file for the irrelevant memory leaks of ompi: I
> > > > make no claim that it catches all possible ones, but it catches all
> > > > that appear in my code.
> > > [snip]
> > >
> > > Thanks, Jody.
> > >
> > > What are the chances something like this could be added / maintained in
> > > the OpenMPI tree? It would be great to have something 1) maintained by
> > > someone more knowledgeable about these errors than me, and 2) installed
> > > by default when I set up my toolchain for parallel debugging.
> > >
> > > > On Tue, Jun 9, 2009 at 3:28 PM, Jeff Squyres<jsquyres_at_[hidden]> wrote:
> > > > > This is worth adding to the FAQ.
> > > > >
> > > > > On Jun 9, 2009, at 2:31 AM, Ashley Pittman wrote:
> > > > >
> > > > >> On Mon, 2009-06-08 at 23:41 -0600, tom fogal wrote:
> > > > >> > George Bosilca <bosilca_at_[hidden]> writes:
> > > > >> > > There is a whole page on the Valgrind web site about this
> > > > >> > > topic. Please read
> > > > >> > > http://valgrind.org/docs/manual/manual-core.html#manual-core.suppress
> > > > >> > > for more information.
> > > > >> >
> > > > >> > Even better, Ralph (et al.), would be if we could just make
> > > > >> > valgrind think this is defined memory. One can do this with
> > > > >> > client requests:
> > > > >> >
> > > > >> > http://valgrind.org/docs/manual/mc-manual.html#mc-manual.clientreqs
> > > > >>
> > > > >> Using the Valgrind client requests unnecessarily is a very bad
> > > > >> idea, they are intended for where applications use their own
> > > > >> memory allocator (i.e. replace malloc/free) or are using custom
> > > > >> kernel modules or hardware which Valgrind doesn't know about.
> > >
> > > Okay, sure, I realize it was a bit of an abuse of the intended use of
> > > the tool.
> > >
> > > > >> The correct solution is either to not send un-initialised memory
> > > > >> or to suppress the error using a suppression file as George
> > > > >> said. As the error is from MPI_Init() you can safely ignore it
> > > > >> from an end-user perspective.
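
A minimal sketch of the first option (not sending uninitialised memory), under stated assumptions: struct msg_header and send_message() are hypothetical and are not taken from Open MPI. The usual cause of a writev() warning of this kind is padding inside a struct that is never written before the whole struct is handed to the kernel. Zeroing the struct first defines those bytes in practice, although strictly speaking the C standard leaves padding unspecified after a member store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical message header. Compilers typically insert padding
 * bytes after `tag` so that `length` is suitably aligned. */
struct msg_header {
    uint8_t  tag;
    uint32_t length;
};

/* Send a header plus payload with a single writev() call.
 * Returns the number of bytes written, or -1 on error. */
ssize_t send_message(int fd, uint8_t tag, const void *payload, uint32_t length) {
    struct msg_header hdr;
    struct iovec iov[2];

    assert(payload != NULL || length == 0);

    /* Zero the whole struct, padding included, before filling it in. */
    memset(&hdr, 0, sizeof hdr);
    hdr.tag = tag;
    hdr.length = length;

    iov[0].iov_base = &hdr;
    iov[0].iov_len = sizeof hdr;
    iov[1].iov_base = (void *)payload; /* writev() only reads from it */
    iov[1].iov_len = length;

    return writev(fd, iov, 2);
}
```

The cost is one small memset per message, which is the overhead the configure option or environment variable discussed in this thread would make optional.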
> > >
> > > As I mentioned in my initial message, MPI_Init is only one such
> > > error; I get them in a lot of MPI calls, seemingly anything that does
> > > communication. Though I've heard differently on this list, this led me
> > > to believe I was doing something wrong in my code.
> > >
> > > It seems like the only way I could verify that I'm not causing these
> > > errors myself is to grok the call stacks I'm given for each vg error
> > > and figure out where the uninitialized memory comes from, and then make
> > > a judgement call for myself whether this makes sense to suppress. Or
> > > I could mail the list about every error I see and ask for confirmation
> > > that it's benign/suppressible. Most likely, I'll take the simple
> > > approach and just use the suppression file I was given, but that's
> > > prone to be fragile and break with a future OpenMPI release.
> > >
> > > What about an environment variable which enables slower,
> > > valgrind-friendly behavior? There's precedent in other libraries, e.g.
> > > glib [1].
> > >
> > > -tom
> > >
> > > [1] http://library.gnome.org/devel/glib/stable/glib-running.html
>