Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] memcpy overlap in ompi_ddt_copy_content_same_ddt and glibc 2.12
From: Number Cruncher (number.cruncher_at_[hidden])
Date: 2010-11-11 06:36:04


On 11/11/10 10:56, Jed Brown wrote:
> On Thu, Nov 11, 2010 at 11:45, Number Cruncher
> <number.cruncher_at_[hidden]> wrote:
>
> Having just replaced the memcpy with Linus's safe forward-copy
> version from
> https://bugzilla.redhat.com/show_bug.cgi?id=638477#c38 I can
> report no more problems with my Open MPI program which was
> previously behaving unpredictably after calls to memcpy with
> overlapping ranges.
>
>
> Do you happen to have a test case? I am running glibc-2.12.1 on
> 64-bit Arch Linux and although valgrind reports the overlapping
> memcpy, I have not yet noticed incorrect results or crashes.
>
> Jed

Unfortunately, I don't have a test case I can send; an actual problem
only manifested itself when running one of our commercial applications
on a fresh Fedora 14 install (dual Xeon E5620).

However, as noted in
https://bugzilla.redhat.com/show_bug.cgi?id=638477#c86 the valgrind
memcpy implementation is overlap-safe.

Are you using an Intel Nehalem-class CPU? The bug was also only
intermittent for me; I'm not entirely sure why. The program would hang
in unmatched collectives in 60-80% of runs. With a forward memcpy, it
never hung.

I can offer a thought experiment instead. Consider a source and a
destination where the destination starts 1 byte before the source
(lowercase letters mark destination bytes that alias the source):
SRC: ABCD
DST: Xabc

Forward-copying memcpy (lowest address first):
SRC: ABCD
DST: Aabc
DST: ABbc
DST: ABCc
DST: ABCD

Backward-copying memcpy (highest address first):
SRC: ABCD
DST: XabD
DST: XaDD
DST: XDDD
DST: DDDD (WRONG)
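
The thought experiment above can be reproduced with a small C sketch.
This is only an illustration, not the glibc or Open MPI code: the
helpers copy_forward, copy_backward, run_overlap_case and
check_thought_experiment are hypothetical names made up for this
example, and the byte-by-byte loops merely stand in for the two copy
directions. The standard memmove is included for comparison, since it
is defined to handle overlapping ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy byte by byte, lowest address first. */
static void copy_forward(char *dst, const char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Hypothetical helper: copy byte by byte, highest address first. */
static void copy_backward(char *dst, const char *src, size_t n)
{
    while (n-- > 0)
        dst[n] = src[n];
}

/*
 * Lay out "XABCD" in buf, then copy 4 bytes from buf+1 (SRC: ABCD)
 * to buf (DST: Xabc), i.e. the destination starts 1 byte before
 * the source, as in the thought experiment.
 */
static void run_overlap_case(void (*copy)(char *, const char *, size_t),
                             char buf[6])
{
    memcpy(buf, "XABCD", 6);   /* no overlap here: literal -> buf */
    copy(buf, buf + 1, 4);
}

/* Checks the outcomes traced by hand in the message above. */
void check_thought_experiment(void)
{
    char buf[6];

    run_overlap_case(copy_forward, buf);
    assert(memcmp(buf, "ABCD", 4) == 0);   /* forward: correct */

    run_overlap_case(copy_backward, buf);
    assert(memcmp(buf, "DDDD", 4) == 0);   /* backward: source clobbered */

    memcpy(buf, "XABCD", 6);
    memmove(buf, buf + 1, 4);              /* overlap-safe by definition */
    assert(memcmp(buf, "ABCD", 4) == 0);
}
```

Calling check_thought_experiment() from main runs all three cases. Note
that a forward copy is only safe when the destination lies below the
source; with the destination above the source, it is the backward copy
that gives the right answer, which is why memmove picks its direction
from the relative addresses.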