On Apr 23, 2014, at 4:45 PM, Ross Boylan <ross_at_[hidden]> wrote:
>> is OK. So, if any nonblocking calls are used, one must use mpi.test or
>> mpi.wait to check if they are complete before trying any blocking calls.
That is also correct -- it's MPI semantics (communications initiated by MPI_Isend / MPI_Irecv must be completed via one of the flavors of MPI_Test or MPI_Wait).
> That sounds like a different problem than the one I encountered. The
> system did get hung up, but the reason was that processes received
> corrupted R objects, threw an error, and stopped responding.
> The root of my problem was that objects got garbage collected before the
> isend completed.
This is definitely a problem with garbage-collected languages: MPI owns the buffer until the corresponding Test/Wait indicates that it has finished with it.
If the buffer disappears or is changed out from under MPI, unpredictable/undefined behavior can certainly result.
> This will happen regardless of subsequent R-level
> calls (e.g., to mpi.test). The object to be transmitted is serialized
> and passed to C, but when the call returns there are no R references to
> the object--that is, the serialized version of the object--and so it is
> subject to garbage collection.
> I'd be happy to provide my modifications to get around this. Although
> they worked for me, they are not really suitable for general use. There
> are 2 main issues: first, I ignored the asynchronous receive since I
> didn't use it. Since MPI request objects are used for both sending and
> receiving, I suspect that mixing irecv's in with code doing isends would
> not work right.
FWIW, this works fine. It's quite common (in C and Fortran) to mix various kinds of MPI_Request handles into a single array-based Test or Wait. MPI figures it out.
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/