Before worrying too much about the inefficiency of packing into one contiguous scratch buffer to send from, and receiving into a second contiguous scratch buffer to unpack from, it is worth finding out how Open MPI handles a discontiguous datatype on your platform.
Gathering outgoing data directly from discontiguous memory to the network interface, and scattering incoming data from the network interface directly to discontiguous memory, is practical in some cases but not in all. When it is not practical, the fallback inside the MPI implementation can involve allocating scratch buffers under the covers and doing a pack/unpack guided by the datatype. If that is what is happening, then your own pack/unpack can be at least as efficient as libmpi interpreting a datatype to do the same thing.
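For illustration only, here is a minimal sketch of the hand-written pack/unpack described above. It assumes the data is a set of separately malloc'ed rows of doubles whose lengths are known on both sides. The ragged_t type and the ragged_count, ragged_pack and ragged_unpack names are hypothetical, made up for this example, and are not part of MPI or of the original poster's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout (an assumption for this sketch):
 * nrows separately malloc'ed rows of doubles. */
typedef struct {
    size_t   nrows;
    size_t  *len;   /* len[i] = number of doubles in row i */
    double **row;   /* row[i] = start of row i, allocated on its own */
} ragged_t;

/* Total number of doubles, i.e. the size the scratch buffer must have. */
size_t ragged_count(const ragged_t *r)
{
    assert(r != NULL);
    size_t n = 0;
    for (size_t i = 0; i < r->nrows; i++)
        n += r->len[i];
    return n;
}

/* Copy every row, in order, into the contiguous buffer buf.
 * Returns the number of doubles written. */
size_t ragged_pack(const ragged_t *r, double *buf)
{
    assert(r != NULL && buf != NULL);
    size_t off = 0;
    for (size_t i = 0; i < r->nrows; i++) {
        memcpy(buf + off, r->row[i], r->len[i] * sizeof(double));
        off += r->len[i];
    }
    return off;
}

/* Reverse of ragged_pack: scatter the contiguous buffer back into rows
 * that the receiver has already allocated with matching lengths.
 * Returns the number of doubles consumed. */
size_t ragged_unpack(ragged_t *r, const double *buf)
{
    assert(r != NULL && buf != NULL);
    size_t off = 0;
    for (size_t i = 0; i < r->nrows; i++) {
        memcpy(r->row[i], buf + off, r->len[i] * sizeof(double));
        off += r->len[i];
    }
    return off;
}
```

The sender would call ragged_pack into a buffer of ragged_count(&r) doubles and pass that buffer to MPI_Send with MPI_DOUBLE as the datatype. The receiver would post a matching MPI_Recv into its own scratch buffer and then call ragged_unpack. Allocating the two scratch buffers once and reusing them for every message avoids repeated malloc/free on a frequently used path.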
If the data is being passed via shared memory then it should be practical for the MPI implementation to avoid pack/unpack scratch buffers.
The use of a datatype is clearly more elegant, and when the MPI implementation is able to avoid intermediate buffering it is likely to be more efficient as well.
Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363
email@example.com wrote on 08/20/2008 02:07:47 PM:
> Yes that's what I understood after struggling with it over a week. I
> need to send such messages frequently so creating and destroying the
> data type each time may be expensive. What would be the best alternative
> for sending such malloced data ? Though I can always pack the data in a
> long array and unpack at the opposite end as I know the structure of the
> data at each node. Anything more efficient and elegant will be better.
> Thanks for the help.