Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] invalid write in opal_generic_simple_unpack
From: Patrik Jonsson (code_at_[hidden])
Date: 2012-03-14 16:06:41


On Wed, Mar 14, 2012 at 3:43 PM, Jeffrey Squyres <jsquyres_at_[hidden]> wrote:
> On Mar 14, 2012, at 9:38 AM, Patrik Jonsson wrote:
>
>> I'm trying to track down a spurious segmentation fault that I'm
>> getting with my MPI application. I tried using valgrind, and after
>> suppressing the 25,000 errors in PMPI_Init_thread and associated
>> Init/Finalize functions,
>
> I haven't looked at these in a while, but the last time I looked, many/most of them came from one of several sources:
>
> - OS-bypass network mechanisms (i.e., the memory is ok, but valgrind isn't aware of it)
> - weird optimizations from the compiler (particularly from non-gcc compilers)
> - weird optimizations in glib or other support libraries
> - Open MPI sometimes specifically has "holes" of uninitialized data that we memcpy (long story short: it can be faster to copy a large region that contains a hole rather than doing 2 memcopies of the fully-initialized regions)
>
> Other than what you cited below, are you seeing others?  What version of Open MPI is this?  Did you --enable-valgrind when you configured Open MPI?  This can reduce a bunch of these kinds of warnings.

I didn't install Open MPI myself, but I doubt it was configured with
--enable-valgrind.

>
>> I'm left with an uninitialized write in
>> PMPI_Isend (which I saw is not unexpected), plus this:
>>
>> ==11541== Thread 1:
>> ==11541== Invalid write of size 1
>> ==11541==    at 0x4A09C9F: _intel_fast_memcpy (mc_replace_strmem.c:650)
>
> That doesn't seem right.  It's an *invalid* write, not an *uninitialized* access.  Could be serious.
>
>> ==11541==    by 0x5093447: opal_generic_simple_unpack (opal_datatype_unpack.c:420)
>> ==11541==    by 0x508D642: opal_convertor_unpack (opal_convertor.c:302)
>> ==11541==    by 0x4F8FD1A: mca_pml_ob1_recv_frag_callback_match (pml_ob1_recvfrag.c:217)
>> ==11541==    by 0x4ED51BD: mca_btl_tcp_endpoint_recv_handler (btl_tcp_endpoint.c:718)
>> ==11541==    by 0x509644F: opal_event_loop (event.c:766)
>> ==11541==    by 0x507FA50: opal_progress (opal_progress.c:189)
>> ==11541==    by 0x4E95AFE: ompi_request_default_test (req_test.c:88)
>> ==11541==    by 0x4EB8077: PMPI_Test (ptest.c:61)
>> ==11541==    by 0x78C4339: boost::mpi::request::test() (in /n/home00/pjonsson/lib/libboost_mpi.so.1.48.0)
>
> It looks like this is happening in the TCP receive handler; it received some data from a TCP socket and is trying to copy it to the final, MPI-specified receive buffer.
>
> If you can attach a debugger here, it might be useful to verify that OMPI is copying to the target buffer that was presumably specified in a prior call to MPI_IRECV (and also double-check that this buffer is still valid).

The difficulty was that there were many sends and the error was
intermittent, so it was hard to know whether I had stopped in the
right unpack.

I think I tracked it down, though. The problem was in the boost.mpi
"skeleton/content" feature (which has bitten me in the past).
Essentially, any serialization operator that uses a temporary will
silently give incorrect results when using skeleton/content, because
the get_content operator captures the location of the temporary when
building the custom MPI data type, which then causes the data to get
deposited in some invalid location.

There is scant documentation of this feature and the above conclusion
is my own, but I'm pretty sure it's correct. Even the built-in boost
serializations aren't safe. Serializing an enum, for example, uses a
temporary and will thus not work correctly with these operators.
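
To make the mechanism concrete, here is a minimal sketch that does not use
Boost.MPI at all. It assumes that get_content works by recording the address
of every value handed to the archive; the names AddressRecorder, Particle,
Color and run_demo are hypothetical and exist only for this illustration. The
sketch only compares addresses and never writes through them, so it shows
where the data would be deposited without actually corrupting memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

// Hypothetical stand-in for an address-capturing archive: instead of
// copying bytes, it only records where each serialized value lives.
struct AddressRecorder {
    std::vector<std::uintptr_t> addresses;

    template <typename T>
    AddressRecorder& operator&(T& value) {
        addresses.push_back(reinterpret_cast<std::uintptr_t>(&value));
        return *this;
    }
};

enum class Color : int { Red, Green };

struct Particle {
    double mass = 1.0;
    Color color = Color::Red;

    // The member itself is handed to the archive, so its real address
    // is what gets recorded.
    template <typename Archive>
    void serialize_direct(Archive& ar) { ar & mass; }

    // The value goes through a local temporary, so the archive records
    // the address of a stack variable that dies when this call returns.
    template <typename Archive>
    void serialize_via_temporary(Archive& ar) {
        int tmp = static_cast<int>(color);
        ar & tmp;
        color = static_cast<Color>(tmp);
    }
};

// True if addr lies within the storage of p.
bool points_inside(const Particle& p, std::uintptr_t addr) {
    const auto begin = reinterpret_cast<std::uintptr_t>(&p);
    return addr >= begin && addr < begin + sizeof(Particle);
}

struct CaptureReport {
    bool direct_inside_object;
    bool temporary_inside_object;
};

// Records one address per serialization style and reports whether each
// recorded address actually belongs to the object being described.
CaptureReport run_demo() {
    Particle p;
    AddressRecorder rec;
    p.serialize_direct(rec);
    p.serialize_via_temporary(rec);
    assert(rec.addresses.size() == 2);

    CaptureReport report{points_inside(p, rec.addresses[0]),
                         points_inside(p, rec.addresses[1])};
    std::cout << "direct member inside object: "
              << report.direct_inside_object << "\n"
              << "temporary inside object:     "
              << report.temporary_inside_object << "\n";
    return report;
}
```

Calling run_demo() from main shows that the address recorded through the
temporary is not part of the object. A receive that later deposits data at
that address would write to whatever happens to occupy that stack slot by
then, which is consistent with the invalid write valgrind reported.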

> Is there any chance that you can provide a small reproducer in C without all the Boost stuff?

As is clear from the above, no. The problem was in my code and boost.

I do have a more general question, though: Is there a good way to back
out the location of the request object if I stop deep in the bowels of
MPI? As I understand it, just because the user-level call is a certain
MPI_Test doesn't mean that under the hood it isn't working on other
requests, and this nonlocality makes it difficult to track down
errors.

Thanks,

/Patrik