I'm getting many "Source and destination overlap in memcpy" errors when
running my application on an odd number of procs.
I believe this is because the Allgather collective is using Bruck's
algorithm and doing a shift on the buffer as a finalisation step
(coll_tuned_allgather.c):
    tmprecv = (char*) rbuf;
    tmpsend = (char*) rbuf + (size - rank) * rcount * rext;
    err = ompi_ddt_copy_content_same_ddt(rdtype, rank * rcount,
                                         tmprecv, tmpsend);
Unfortunately, ompi_ddt_copy_content_same_ddt does a memcpy instead of
the memmove that is needed here. For this left shift of the buffer, any
forward-copying memcpy should in practice be OK, since each byte is read
before it is overwritten. It still violates memcpy's precondition that
source and destination must not overlap, though, so the behaviour is
undefined and may break with some implementations.
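
To illustrate the kind of fix I mean, here is a minimal sketch of an
overlap-safe left shift. shift_left is a hypothetical helper of my own,
not an Open MPI function, and it assumes a contiguous byte buffer, which
a general datatype copy need not be. With the pointers quoted above, the
two ranges overlap whenever rank > size - rank.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Open MPI: move `count` bytes that
 * start at buf + offset down to the start of buf. */
static void shift_left(char *buf, size_t offset, size_t count)
{
    assert(buf != NULL);

    /* Source [offset, offset + count) and destination [0, count)
     * overlap whenever count > offset. memcpy is undefined for
     * overlapping ranges; memmove is specified to handle them. */
    memmove(buf, buf + offset, count);
}
```

For example, shift_left(buf, 3, 5) on an 8-byte buffer moves bytes 3..7
to positions 0..4 even though the two ranges share bytes 3 and 4.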
I think this issue was dismissed too lightly previously:
http://www.open-mpi.org/community/lists/users/2007/08/3873.php
Thanks,
Simon