On 11/11/10 10:56, Jed Brown wrote:
On Thu, Nov 11, 2010 at 11:45, Number Cruncher <number.cruncher@ntlworld.com> wrote:
Having just replaced memcpy with Linus's safe forward-copy version from https://bugzilla.redhat.com/show_bug.cgi?id=638477#c38, I can report no more problems with my Open MPI program, which was previously behaving unpredictably after calls to memcpy with overlapping ranges.

Do you happen to have a test case?  I am running glibc-2.12.1 on 64-bit Arch Linux and although valgrind reports the overlapping memcpy, I have not yet noticed incorrect results or crashes.

Jed
Unfortunately, I don't have a test case I can send; an actual problem only manifested itself when running one of our commercial applications on a fresh F14 install (dual Xeon E5620).

However, as commented here: https://bugzilla.redhat.com/show_bug.cgi?id=638477#c86 the valgrind memcpy implementation is overlap-safe, so a run under valgrind would not show the incorrect results.

Are you using an Intel Nehalem-class CPU? The bug was only intermittent for me too; I'm not entirely sure why. The program would hang in unmatched collectives in 60-80% of runs. With a forward-copying memcpy, it never hung.

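I can offer a thought-experiment test case. Consider an overlapping source and destination where the destination starts 1 byte before the source (lowercase letters are the destination's view of the overlapping source bytes):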
SRC:  ABCD
DST: Xabc

Forward-copying memcpy (lowest address first):
SRC:  ABCD
DST: Aabc
DST: ABbc
DST: ABCc
DST: ABCD

Backward-copying memcpy (highest address first):
SRC:  ABCD
DST: XabD
DST: XaDD
DST: XDDD
DST: DDDD   (WRONG)
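
The thought experiment above can be reproduced with a small C sketch. This is only an illustration of the two copy directions, not glibc's or Linus's actual code; the names copy_forward, copy_backward and run_case are made up for this example, and the copies are plain byte loops rather than a real optimized memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy one byte at a time, lowest address first. */
void copy_forward(char *dst, const char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Hypothetical helper: copy one byte at a time, highest address first. */
void copy_backward(char *dst, const char *src, size_t n)
{
    while (n-- > 0)
        dst[n] = src[n];
}

/*
 * Set up the layout from the example: memory holds "XABCD", the
 * destination starts at the 'X' and the source one byte later at 'A'.
 * Run the given copy over 4 bytes and return the resulting DST bytes
 * as a NUL-terminated string in out (which must hold 5 bytes).
 */
void run_case(void (*copy)(char *, const char *, size_t), char out[5])
{
    char mem[5] = { 'X', 'A', 'B', 'C', 'D' };

    copy(mem, mem + 1, 4);   /* overlapping: dst = mem, src = mem + 1 */

    memcpy(out, mem, 4);     /* out and mem do not overlap, so this is safe */
    out[4] = '\0';
}
```

Calling run_case(copy_forward, out) leaves "ABCD" in out, while run_case(copy_backward, out) leaves "DDDD", matching the two traces above. When the ranges may overlap, memmove is the standard function that is required to handle it correctly.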