On Thu, 2014-03-13 at 14:53 -0700, Ross Boylan wrote:
> On Thu, 2014-03-13 at 13:13 -0700, Ross Boylan wrote:
> > I might just switch to mpi.send, though the fact that something is going
> > wrong makes me nervous.
> I tried using mpi.send, but it fails also. The failure behavior is
Actually, I hadn't stamped out all the mpi.isends. When I did, it
worked. From that and further debugging I determined the following:
1. The messages were being sent and received at the MPI level. After
getting the bytes Rmpi unserializes them; this was failing because the
bytes were corrupt (though of the expected length). This failure
happened before the message saying the message was received was
printed. The error stopped the process, so it couldn't receive any
more messages.
So the immediate failure was on the receiver, not the sender. It is
likely this was triggered by the sender sending something garbled.
2. A small amount of testing with mpi.send finds no errors.
3. Fewer than half the bytes in the delivered stream matched those in the
sent stream. The first apparent discrepancy was at the 10th byte, the next
at byte 5176, and then a block that differed started at byte 102251.
Further, the very end of the stream was not all 0's. I am not certain that
I correctly identified the original stream.
4. Although it's not entirely consistent with 3, my leading theory of
what happened is that the object being sent got garbage collected by R
before it was sent entirely. But I'd expect the first part of the
stream to match (and possibly the last part to be 0, though that's less
likely) if that were true, and that doesn't exactly seem to be the case.
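
A minimal base-R sketch of the byte comparison in point 3 and the
receiver-side failure in point 1. The object and the corrupted positions
(bytes 1 and 10) are made up for illustration; the real streams came from
Rmpi, and this is not the code used in the debugging above.

```r
# Stand-in data: serialize an object, then damage a copy to mimic a
# garbled delivery of the same length.
sent <- serialize(list(id = 1L, values = 1:100), NULL)
received <- sent
bad <- c(1, 10)                  # hypothetical corrupted positions
received[bad] <- !received[bad]  # bitwise NOT guarantees the bytes differ

# Positions where the two streams disagree, and the fraction that match.
mismatch <- which(sent != received)
match_fraction <- mean(sent == received)

# Guarded unserialize: capture the error message instead of letting the
# error stop the receiving process.
result <- tryCatch(unserialize(received),
                   error = function(e) conditionMessage(e))
```

Here `mismatch` lists the differing byte positions, and `result` holds an
error message rather than an object, because the damaged header can no
longer be unserialized.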
Rmpi's R code for the send is

  mpi.isend.Robj <- function(obj, dest, tag, comm=1,request=0)
      mpi.isend(x=serialize(obj, NULL), type=4, dest=dest, tag=tag,
The key thing is that serialize() returns the bytes to be sent, and from
R's point of view nobody needs them when the call returns.
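
If the garbage-collection theory is right, one possible workaround is to
keep the serialized bytes reachable from R until the request completes.
A hedged sketch: `pending`, `isend_keepalive` and `wait_release` are
hypothetical names, not part of Rmpi. The `isend` and `wait` arguments
default to Rmpi's `mpi.isend` and `mpi.wait`, and can be replaced by stubs
when Rmpi is not available. Whether this actually cures the corruption is
untested.

```r
# Hypothetical registry holding one serialized buffer per request number.
pending <- new.env()

# Serialize obj, remember the bytes, then start the non-blocking send.
# type = 4 is the raw type used by mpi.isend.Robj above.
isend_keepalive <- function(obj, dest, tag, comm = 1, request = 0,
                            isend = Rmpi::mpi.isend) {
  bytes <- serialize(obj, NULL)
  assign(as.character(request), bytes, envir = pending)
  isend(x = bytes, type = 4, dest = dest, tag = tag,
        comm = comm, request = request)
}

# Wait for the request to finish, then let R reclaim the buffer.
wait_release <- function(request, status = 0, wait = Rmpi::mpi.wait) {
  wait(request, status)
  rm(list = as.character(request), envir = pending)
  invisible(NULL)
}
```

The buffer stays referenced from `pending`, so R cannot collect it, until
`wait_release` is called for the same request number.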
mpi.isend is a wrapper for a call to C code, and that C code invokes the