Looking at this a little closer on the v1.2 branch, it does look like
it could be a bug.
The child definitely does not return from INTERCOMM_MERGE until the
parent enters MPI_RECV. So I put in a bogus MPI_TEST call before the
parent calls MPI_RECV, and that also causes the child to return from
INTERCOMM_MERGE. That makes it sound like we have something that is
not finishing progress properly before leaving INTERCOMM_MERGE;
calling progress again (e.g., calling MPI_TEST or the MPI_RECV) makes
enough happen to allow the children to complete the merge.
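To make that suspicion concrete, here is a toy model in C. It is not Open MPI code, and every name in it (toy_engine, toy_post, toy_progress, toy_parent_merge, toy_parent_test) is made up for illustration. It only sketches the pattern I think we are seeing: the parent's merge queues a final handshake but returns without driving progress, so the child cannot finish until some later call happens to run the progress loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in Open MPI. */
#define TOY_MAX_PENDING 8

typedef struct {
    bool *pending[TOY_MAX_PENDING]; /* flags to set once a queued message is delivered */
    size_t count;                   /* number of messages still queued */
} toy_engine;

/* Queue a message; nothing is delivered until toy_progress() runs. */
static void toy_post(toy_engine *e, bool *delivered_flag) {
    assert(e->count < TOY_MAX_PENDING);
    e->pending[e->count++] = delivered_flag;
}

/* Deliver everything queued so far; returns how many messages moved. */
static size_t toy_progress(toy_engine *e) {
    size_t moved = e->count;
    for (size_t i = 0; i < e->count; i++) {
        *e->pending[i] = true;
    }
    e->count = 0;
    return moved;
}

/* Parent side of the "merge": queues the final handshake for the child
 * but returns without calling toy_progress(). This models the suspected bug. */
static void toy_parent_merge(toy_engine *e, bool *child_got_handshake) {
    toy_post(e, child_got_handshake);
}

/* Stand-in for any later call that happens to drive progress
 * (MPI_TEST or MPI_RECV in the real case). */
static void toy_parent_test(toy_engine *e) {
    (void)toy_progress(e);
}
```

In this model the child's flag stays false after toy_parent_merge() returns and only flips once toy_parent_test() runs, which matches what the bogus MPI_TEST did in my test. Whether the real v1.2 code fails in this way is still a guess on my part.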
To be honest, I don't think we'll be too motivated to fix this in the
old v1.2 series because we're getting darn close to putting out v1.3.
Support for the dynamics and the progression engine have changed a
*lot* behind the scenes in v1.3.
To be specific: this problem doesn't seem to happen in the code for
the upcoming v1.3 release. However, I would not encourage using a
nightly snapshot at the moment; we have a fairly gnarly bug in other
kinds of progression issues that needs to be fixed.
On Jul 28, 2008, at 5:02 PM, Jeff Squyres wrote:
> On Jul 28, 2008, at 4:56 PM, Aurélien Bouteiller wrote:
>> Having different values is fine for high parameter.
>> I think the problem comes from using NULL, NULL instead of &argc,
>> &argv as parameters for MPI_Init.
> Calling MPI_INIT with NULL, NULL is legal; we don't actually do
> anything with those values, IIRC.
> Jeff Squyres
> Cisco Systems