Is it possible to efficiently poll for both incoming messages and
request completion using only one thread? As far as I know, busy
waiting with alternating MPI_Iprobe and MPI_Testsome calls is the only
way to do this. Is that approach likely to hurt performance?
Background: my application is memory constrained, so when requests
complete I may suddenly be able to schedule new computation. At the
same time, I need to respond to a variety of asynchronous messages
from unknown processes with unknown message sizes, which, as far as I
know, I can't turn into a request to poll on.