Subject: Re: [OMPI devel] [OMPI users] Memory manager
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-11-28 11:27:47

On Nov 27, 2007, at 5:13 PM, Terry Frankcombe wrote:

> ==20671== Conditional jump or move depends on uninitialised value(s)
> ==20671== at 0x40152B1: (within /lib/
> ==20671== by 0x400A289: (within /lib/
> ==20671== by 0x6A42E4D: (within /lib/
> ==20671== by 0x59AE0E3: (within /lib/
> ==20671== by 0x400D725: (within /lib/
> ==20671== by 0x59AE4EC: (within /lib/
> ==20671== by 0x59AE099: dlsym (in /lib/
> ==20671== by 0x57610FB: vm_sym
> (in /usr/local/lib/
> ==20671== by 0x575E29E: lt_dlsym
> (in /usr/local/lib/
> ==20671== by 0x57666EF: open_component
> (in /usr/local/lib/
> ==20671== by 0x576711B: mca_base_component_find
> (in /usr/local/lib/
> ==20671== by 0x5767A9F: mca_base_components_open
> (in /usr/local/lib/
> This looks particularly broken!
> I've just run valgrind on another (serial) piece of code on this
> machine and got three of the uninitialised jumps from within,
> virtually identical to the first three from this MPI code. Of the 24
> from the code, those seeming to originate from within OpenMPI are
> particularly worrying.

These are usually false positives -- in my [not comprehensive]
experience, they typically result from valgrind trying to analyze
optimized code for which debugging information is not fully available.
For example, the one snippet above is from a supposedly uninitialized
variable in the library function dlsym() (it is part of the dynamic
linker interface, not a system call), reached through frames inside
the system's own libraries. I strongly suspect that this is not a real
problem.

As for valgrind not finding your real problem -- bummer. It can't
always find everything. :-( Perhaps try Electric Fence and/or other
kinds of "watch" actions to see exactly when variables change (that
might give insight into whether a buffer is being overflowed, etc.)...?
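
As a rough illustration of one such "watch" action, here is a minimal
sketch of guard zones placed around a heap buffer. This is not how
Electric Fence works (it relies on inaccessible memory pages); it is a
simpler, hand-rolled check. The names guarded_alloc, guarded_check,
guarded_free, GUARD_BYTES and GUARD_PATTERN are all hypothetical, made
up for this example, and the sketch assumes a 16-byte offset keeps the
returned pointer suitably aligned on your platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical guard size and fill byte, chosen only for illustration. */
#define GUARD_BYTES   16
#define GUARD_PATTERN 0xA5

/* Allocate n usable bytes with a guard zone before and after them.
 * Returns a pointer to the usable region, or NULL on failure. */
void *guarded_alloc(size_t n)
{
    unsigned char *raw = malloc(n + 2 * GUARD_BYTES);
    if (raw == NULL) {
        return NULL;
    }
    memset(raw, GUARD_PATTERN, GUARD_BYTES);                   /* front guard */
    memset(raw + GUARD_BYTES + n, GUARD_PATTERN, GUARD_BYTES); /* back guard */
    return raw + GUARD_BYTES;
}

/* Inspect both guard zones of a buffer obtained from guarded_alloc(n).
 * Returns 0 if intact, -1 if the front guard was modified (underrun),
 * and 1 if the back guard was modified (overrun). */
int guarded_check(const void *p, size_t n)
{
    assert(p != NULL);
    const unsigned char *front = (const unsigned char *)p - GUARD_BYTES;
    const unsigned char *back = (const unsigned char *)p + n;
    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (front[i] != GUARD_PATTERN) {
            return -1;
        }
    }
    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (back[i] != GUARD_PATTERN) {
            return 1;
        }
    }
    return 0;
}

/* Release a buffer obtained from guarded_alloc(). */
void guarded_free(void *p)
{
    if (p != NULL) {
        free((unsigned char *)p - GUARD_BYTES);
    }
}
```

Calling guarded_check() at a few points between the allocation and the
place where things go wrong can narrow down which stretch of code
scribbles past the end of the buffer. It only notices writes that land
within GUARD_BYTES of the buffer, so treat a clean result as a hint
rather than proof.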

Jeff Squyres
Cisco Systems