Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] valgrind problems
From: Justin (luitjens_at_[hidden])
Date: 2009-02-26 22:25:01


Are there any tricks to getting it to work? When we run with valgrind we
get segfaults, and valgrind reports errors in different MPI functions,
for example:

==3629== Invalid read of size 4
==3629== at 0x1CF7AEEC: (within /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so)
==3629== by 0x1D9C23F4: mca_btl_sm_component_progress (in /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so)
==3629== by 0x1D17F14A: mca_bml_r2_progress (in /usr/lib/openmpi/lib/openmpi/mca_bml_r2.so)
==3629== by 0x151FCCD9: opal_progress (in /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)
==3629== by 0xD09FA94: ompi_request_wait_all (in /usr/lib/openmpi/lib/libmpi.so.0.0.0)
==3629== by 0x1E3E47C1: ompi_coll_tuned_sendrecv_actual (in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so)
==3629== by 0x1E3E9105: ompi_coll_tuned_barrier_intra_recursivedoubling (in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so)
==3629== by 0xD0B42FF: PMPI_Barrier (in /usr/lib/openmpi/lib/libmpi.so.0.0.0)
==3629== by 0x7EA025E: Uintah::DataArchiver::initializeOutput(Uintah::Handle<Uintah::ProblemSpec> const&) (DataArchiver.cc:400)
==3629== by 0x899DDDF: Uintah::SimulationController::postGridSetup(Uintah::Handle<Uintah::Grid>&, double&) (SimulationController.cc:352)
==3629== by 0x89A8568: Uintah::AMRSimulationController::run() (AMRSimulationController.cc:126)
==3629== by 0x408B9F: main (sus.cc:622)

This is then followed by a segfault.

Justin

Jeff Squyres wrote:
> On Feb 26, 2009, at 7:03 PM, Justin wrote:
>
>> I'm trying to use valgrind to check if we have any memory problems in
>> our code when running with parallel processors. However, when I run
>> using mpi and valgrind it crashes in various places. For example, some
>> of the time it will crash with a segfault within MPI_Allgatherv
>> despite the fact that all the arguments to the allgather on all
>> processors are completely valid. If we don't use valgrind the
>> program runs just fine.
>> This is on a Debian (lenny) 64-bit machine using the stock MPI
>> package. The command used to launch the job is: mpirun -np 8
>> valgrind -v --log-file=valgrind.%p executable. Are valgrind and
>> openmpi compatible? Are there any special tricks to getting them to
>> work together?
>
>
> We use valgrind internally to track down leaks and other debugging
> kinds of things. So yes, it should work.
>
> I do try to keep up with the latest latest latest valgrind, though.
>
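
For reference, the launch command quoted above can be assembled programmatically. This is only a rough sketch: the helper name build_valgrind_mpirun and its parameters are made up for illustration, and the only flags it uses are the ones from the original command (-np, -v, --log-file with the %p per-process pattern). It builds the argument list and prints it; it does not run anything.

```python
import shlex


def build_valgrind_mpirun(executable, nprocs=8, log_pattern="valgrind.%p"):
    """Return the argv list that runs each MPI process under valgrind.

    Hypothetical helper: it mirrors the command from the thread and
    adds no options beyond the ones quoted there.
    """
    return [
        "mpirun", "-np", str(nprocs),
        # valgrind sits between mpirun and the program, so every
        # process gets its own valgrind instance and its own log
        # (valgrind expands %p to the process ID).
        "valgrind", "-v", f"--log-file={log_pattern}",
        executable,
    ]


if __name__ == "__main__":
    # "executable" is a placeholder, as in the original message.
    print(shlex.join(build_valgrind_mpirun("executable")))
```

Passing the list to subprocess.run instead of printing it would launch the job, assuming mpirun and valgrind are both on the PATH.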