If you use any kind of high-performance network that requires memory registration for communication, the high cost of MPI_Alloc_mem will be hidden by the communication itself. That said, the MPI_Alloc_mem function looks horribly complicated to me, as we redo the whole "find-the-right-allocator" step on every call instead of caching the result. While this could be improved, I'm pretty sure the major part of the overhead comes from the registration itself.
The MPI_Alloc_mem function allocates the memory and then registers it with the high-speed interconnect (InfiniBand, for example). If you don't have IB, this should not happen. You can try forcing the mpool to nothing, or disabling the pinning (mpi_leave_pinned=0,mpi_leave_pinned_pipeline=0), to see if this affects performance.
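To check whether registration really dominates, the measurement from the original report can be reproduced with a small timing harness along these lines. This is only a sketch, not code from the Open MPI tree: the function names, the USE_MPI macro, and the iteration count are made up for illustration, and the MPI part assumes MPI_Init has already been called.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#ifdef USE_MPI              /* hypothetical macro: build with mpicc -DUSE_MPI */
#include <mpi.h>
#endif

/* Monotonic wall-clock time in microseconds. */
static double now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

/* Average cost (us) of one malloc/free pair of `size` bytes, or -1 on failure. */
double time_malloc_us(size_t size, int iters)
{
    assert(size > 0 && iters > 0);
    double start = now_us();
    for (int i = 0; i < iters; i++) {
        void *buf = malloc(size);
        if (buf == NULL)
            return -1.0;
        ((volatile char *)buf)[0] = 1;   /* keep the compiler from eliding the pair */
        free(buf);
    }
    return (now_us() - start) / iters;
}

#ifdef USE_MPI
/* Same measurement for MPI_Alloc_mem/MPI_Free_mem; call after MPI_Init. */
double time_mpi_alloc_us(size_t size, int iters)
{
    assert(size > 0 && iters > 0);
    double start = now_us();
    for (int i = 0; i < iters; i++) {
        void *buf = NULL;
        if (MPI_Alloc_mem((MPI_Aint)size, MPI_INFO_NULL, &buf) != MPI_SUCCESS)
            return -1.0;
        ((volatile char *)buf)[0] = 1;
        MPI_Free_mem(buf);
    }
    return (now_us() - start) / iters;
}
#endif

/* Print both averages side by side for one buffer size. */
void run_comparison(size_t size, int iters)
{
    printf("malloc/free:                %.1f us\n", time_malloc_us(size, iters));
#ifdef USE_MPI
    printf("MPI_Alloc_mem/MPI_Free_mem: %.1f us\n", time_mpi_alloc_us(size, iters));
#endif
}
```

Calling run_comparison(1 << 20, 100) between MPI_Init and MPI_Finalize gives the 1MB case. Running the same binary (say ./bench, a placeholder name) once normally and once with "mpirun --mca mpi_leave_pinned 0 --mca mpi_leave_pinned_pipeline 0 ./bench" should show whether the pinning settings change the MPI_Alloc_mem number.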
On Apr 22, 2010, at 08:50 , Pascal Deveze wrote:
> Hi all,
> The sendrecv_replace in Open MPI seems to allocate/free memory with MPI_Alloc_mem()/MPI_Free_mem().
> I measured the time to allocate/free a buffer of 1MB.
> MPI_Alloc_mem/MPI_Free_mem takes 350us while malloc/free only takes 8us.
> malloc/free in ompi/mpi/c/sendrecv_replace.c was replaced by MPI_Alloc_mem/MPI_Free_mem with this commit:
> user: twoodall
> date: Thu Sep 22 16:43:17 2005 +0000
> summary: use MPI_Alloc_mem/MPI_Free_mem for internally allocated buffers
> Is there a real reason to use these functions, or can we move back to malloc/free?
> Is there a problem with my configuration that would explain such slow performance with MPI_Alloc_mem?