Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] valgrind problems
From: Justin (luitjens_at_[hidden])
Date: 2009-02-26 22:27:15


Also, the stable version of openmpi on Debian is 1.2.7rc2. Are there any
known issues with this version and valgrind?

Thanks,
Justin

Justin wrote:
> Are there any tricks to getting it to work? When we run with valgrind
> we get segfaults, and valgrind reports errors in different MPI functions,
> for example:
>
> ==3629== Invalid read of size 4
> ==3629== at 0x1CF7AEEC: (within /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so)
> ==3629== by 0x1D9C23F4: mca_btl_sm_component_progress (in /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so)
> ==3629== by 0x1D17F14A: mca_bml_r2_progress (in /usr/lib/openmpi/lib/openmpi/mca_bml_r2.so)
> ==3629== by 0x151FCCD9: opal_progress (in /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)
> ==3629== by 0xD09FA94: ompi_request_wait_all (in /usr/lib/openmpi/lib/libmpi.so.0.0.0)
> ==3629== by 0x1E3E47C1: ompi_coll_tuned_sendrecv_actual (in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so)
> ==3629== by 0x1E3E9105: ompi_coll_tuned_barrier_intra_recursivedoubling (in /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so)
> ==3629== by 0xD0B42FF: PMPI_Barrier (in /usr/lib/openmpi/lib/libmpi.so.0.0.0)
> ==3629== by 0x7EA025E: Uintah::DataArchiver::initializeOutput(Uintah::Handle<Uintah::ProblemSpec> const&) (DataArchiver.cc:400)
> ==3629== by 0x899DDDF: Uintah::SimulationController::postGridSetup(Uintah::Handle<Uintah::Grid>&, double&) (SimulationController.cc:352)
> ==3629== by 0x89A8568: Uintah::AMRSimulationController::run() (AMRSimulationController.cc:126)
> ==3629== by 0x408B9F: main (sus.cc:622)
>
> This is then followed by a segfault.
>
> Justin
>
> Jeff Squyres wrote:
>> On Feb 26, 2009, at 7:03 PM, Justin wrote:
>>
>>> I'm trying to use valgrind to check whether we have any memory problems
>>> in our code when running with parallel processors. However, when I
>>> run using MPI and valgrind, it crashes in various places. For example,
>>> sometimes it will crash with a segfault within MPI_Allgatherv, even
>>> though all the arguments to the allgather on all processors are
>>> completely valid. If we don't use valgrind, the program runs just fine.
>>> This is on a Debian (lenny) 64-bit machine using the stock MPI
>>> package. The command used to launch the job is: mpirun -np 8
>>> valgrind -v --log-file=valgrind.%p executable. Are valgrind and
>>> openmpi compatible? Are there any special tricks to getting them to
>>> work together?
>>
>>
>> We use valgrind internally to track down leaks and other debugging
>> kinds of things. So yes, it should work.
>>
>> I do try to keep up with the latest latest latest valgrind, though.
>>