Given the limited scope, would it make sense to add this to the trace library (or a separate debug lib) instead? That is, could we do it via a lib that inserts itself between the MPI binding and the PMPI call? I would hate to duplicate the code in something like sendrecv, but I wonder if we could refactor that to allow for this added capability.
Just a thought. It would allow someone to switch back and forth without recompiling or switching MPI modules.
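To make the interposition idea concrete, here is a rough sketch of what such a shim might look like for MPI_Send only. The environment variable FORCE_SYNC_SEND and the helper force_sync() are made up for illustration; nothing here is taken from the attached patch, and the const-qualified buffer argument assumes MPI-3 style headers (older headers declare it as plain void *).

```c
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical switch: FORCE_SYNC_SEND=1 in the environment turns the
 * conversion on. The result is cached so getenv() runs only once. */
static int force_sync(void)
{
    static int cached = -1;
    if (cached < 0) {
        const char *v = getenv("FORCE_SYNC_SEND");
        cached = (v != NULL && strcmp(v, "1") == 0);
    }
    return cached;
}

/* Overrides the application's MPI_Send and forwards to the profiling
 * (PMPI) entry points, choosing synchronous mode when requested. */
int MPI_Send(const void *buf, int count, MPI_Datatype type,
             int dest, int tag, MPI_Comm comm)
{
    if (force_sync()) {
        return PMPI_Ssend(buf, count, type, dest, tag, comm);
    }
    return PMPI_Send(buf, count, type, dest, tag, comm);
}
```

Built as a shared library, something like this could be linked ahead of libmpi or loaded with LD_PRELOAD. MPI_Isend and MPI_Send_init would map onto PMPI_Issend and PMPI_Ssend_init in the same way. MPI_Sendrecv is the awkward one: it has no synchronous counterpart, so the shim would have to rebuild it from PMPI_Issend, PMPI_Irecv and PMPI_Waitall, which is exactly the duplication I would like to avoid.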
WHAT: MCA parameter for converting all standard mode MPI sends to synchronous mode sends
WHY: helpful in debugging user apps
WHERE: here's the output from "svn st"
WHEN: could be 1.3.4, could be 1.5 -- don't really care which (there's no rush)
TIMEOUT: COB Friday 14 Aug 2009
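A feature we've long talked about is having an MCA parameter to switch all standard mode MPI sends to synchronous mode sends (MPI_SEND, MPI_ISEND, MPI_SEND_INIT, MPI_SENDRECV). This helps users find out whether their application relies on internal MPI buffering: a standard mode send may complete before the matching receive has been posted if the message gets buffered, whereas a synchronous mode send cannot, so code that silently depends on buffering will hang instead of appearing to work.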
Sam from LANL took a crack at implementing this; attached is the patch.
The only concern I have about this patch (echoed by Brian to me in IM) is that it replaces a compile-time constant with a variable lookup in the critical performance code path -- we have to look up the value of a new global variable during MPI_SEND to determine if the send is going to be _SEND_STANDARD or _SEND_SYNCHRONOUS. This could cause a cache miss.
Brian suggested making this stuff compile-out-able via some --configure-cmd-line-switch, similar to how the MPI parameter checking stuff is done (i.e., configure specifies either: always force sync, never force sync, or force to sync based on an MCA parameter value at runtime). This is certainly do-able. However, I'm sending this RFC just in case anyone can think of a better way. Having a compile-time option to effectively remove this capability works fine, but it does reduce the usability of this feature: you effectively have to link your app against a different libmpi.so in order to turn it on.
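For reference, Brian's three-way option could look roughly like the sketch below. All names here (FORCE_SYNC_MODE, force_sync_send, standard_send_mode) are placeholders I made up for this mail, not what the patch uses; FORCE_SYNC_MODE stands in for whatever configure would define.

```c
/* Hypothetical names throughout; not taken from the actual patch. */
#define FORCE_SYNC_NEVER   0
#define FORCE_SYNC_ALWAYS  1
#define FORCE_SYNC_RUNTIME 2

/* Would normally be set by configure; default to the runtime check. */
#ifndef FORCE_SYNC_MODE
#define FORCE_SYNC_MODE FORCE_SYNC_RUNTIME
#endif

typedef enum { SEND_STANDARD, SEND_SYNCHRONOUS } send_mode_t;

/* Stand-in for the global that the MCA parameter would set during init. */
static int force_sync_send = 0;

/* Picks the mode for a standard send. In the ALWAYS and NEVER builds the
 * result is a compile-time constant, so the global is never read. */
static inline send_mode_t standard_send_mode(void)
{
#if FORCE_SYNC_MODE == FORCE_SYNC_ALWAYS
    return SEND_SYNCHRONOUS;
#elif FORCE_SYNC_MODE == FORCE_SYNC_NEVER
    return SEND_STANDARD;
#else
    return force_sync_send ? SEND_SYNCHRONOUS : SEND_STANDARD;
#endif
}
```

Only the runtime build pays for the extra load of the global in the send path; the other two builds compile down to the same constant we have today.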
Does anyone have any suggestions? Or are we stuck with compile-time checking?