
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] RFC: about dynamic/intercomm_create test from ibm test suite
From: Gilles Gouaillardet (gilles.gouaillardet_at_[hidden])
Date: 2014-05-28 07:45:59


On Wed, May 28, 2014 at 8:31 PM, Jeff Squyres (jsquyres)
> To be totally clear: MPI says it is erroneous for only some (not all)
> processes in a communicator to call MPI_COMM_FREE. So if that's the real
> problem, then the discussion about why the parent(s) is(are) trying to
> contact the children is moot -- the test is erroneous, and erroneous
> application behavior is undefined.

This is definitely what happens: only some tasks call MPI_Comm_free().
I will commit my changes, and with them the initially reported issue is solved :-)

About the "bonus points":

v1.8 does not have this issue

I dug into it. Bottom line: the parent (which, unlike the children, did not
call MPI_Comm_free) calls ompi_dpm_base_dyn_finalize, which tries to isend
to the already-exited tasks.

The difference shows up in pml_ob1_sendreq.h, line 450:

with v1.8:
mca_bml_base_btl_array_get_size(&endpoint->btl_eager) = 0
nothing is sent, but the isend is reported as successful

with trunk:
mca_bml_base_btl_array_get_size(&endpoint->btl_eager) = 1
so it then tries to send the message => BOOM
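
To make the difference concrete, here is a toy model in plain C of the two
cases described above. It is only a sketch of the observed behaviour, not the
code in pml_ob1_sendreq.h: toy_endpoint_t, toy_isend and the TOY_* codes are
hypothetical names made up for illustration, eager_btl_count stands in for the
value returned by mca_bml_base_btl_array_get_size(&endpoint->btl_eager), and
the real failure on trunk is a crash rather than a clean error code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical stand-ins, not Open MPI types. */
typedef struct {
    size_t eager_btl_count; /* plays the role of the btl_eager array size */
    int    peer_alive;      /* 0 once the peer task has exited */
} toy_endpoint_t;

enum { TOY_SUCCESS = 0, TOY_ERR_UNREACH = -1 };

/* Mimics the reported behaviour: with no eager BTL the send is skipped
 * but still reported as successful; otherwise a send is attempted and
 * fails if the peer has already exited. */
static int toy_isend(const toy_endpoint_t *ep, int *attempts)
{
    assert(ep != NULL && attempts != NULL);
    if (ep->eager_btl_count == 0) {
        return TOY_SUCCESS;      /* nothing sent, success reported */
    }
    ++*attempts;                 /* a send is actually attempted */
    return ep->peer_alive ? TOY_SUCCESS : TOY_ERR_UNREACH;
}
```

With an exited peer, an endpoint whose eager_btl_count is 0 (the v1.8 case)
returns TOY_SUCCESS without attempting anything, while one whose count is 1
(the trunk case) attempts the send and fails.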

I found various things that seem counterintuitive to me and will summarize
all of this tomorrow.