On Wed, May 28, 2014 at 8:31 PM, Jeff Squyres (jsquyres)
> To be totally clear: MPI says it is erroneous for only some (not all)
> processes in a communicator to call MPI_COMM_FREE. So if that's the real
> problem, then the discussion about why the parent(s) is(are) trying to
> contact the children is moot -- the test is erroneous, and erroneous
> application behavior is undefined.
This is definitely what happens: only some tasks call MPI_Comm_free().
I will commit my changes, and the initially reported issue is solved :-)
About the "bonus points":
v1.8 does not have this issue.
I dug into it and, bottom line, the parent (which, unlike the children, did
not call MPI_Comm_free) calls ompi_dpm_base_dyn_finalize, which tries to
isend to the already exited tasks.
Bottom line, in pml_ob1_sendreq.h line 450, there are two cases:
mca_bml_base_btl_array_get_size(&endpoint->btl_eager) = 0
nothing is sent, but isend is reported successful
mca_bml_base_btl_array_get_size(&endpoint->btl_eager) = 1
isend then tries to send the message => BOUM
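
To make the two cases easier to picture, here is a toy model in plain C. It is only a sketch of the behaviour described above, not the Open MPI code: toy_endpoint_t, toy_isend and the TOY_SEND_* values are hypothetical names, num_eager_btls stands in for the value returned by mca_bml_base_btl_array_get_size(&endpoint->btl_eager), and the "peer is gone" failure stands in for the crash.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an endpoint; NOT an Open MPI type. */
typedef struct {
    size_t num_eager_btls; /* models the eager BTL array size */
    int    peer_alive;     /* 0 once the remote task has exited */
} toy_endpoint_t;

typedef enum {
    TOY_SEND_SKIPPED, /* no eager BTL: nothing sent, success reported */
    TOY_SEND_OK,      /* message handed to a BTL, peer still alive */
    TOY_SEND_FAILED   /* a BTL was used but the peer has exited */
} toy_send_result_t;

/* Models the two cases: with an empty eager array the send is a silent
 * no-op; with one entry the send is attempted and fails if the peer
 * has already exited. */
toy_send_result_t toy_isend(const toy_endpoint_t *ep)
{
    assert(ep != NULL);
    if (ep->num_eager_btls == 0) {
        return TOY_SEND_SKIPPED;
    }
    if (!ep->peer_alive) {
        return TOY_SEND_FAILED;
    }
    return TOY_SEND_OK;
}
```

With an exited peer, an endpoint with size 0 gives TOY_SEND_SKIPPED (the erroneous test goes unnoticed), while an endpoint with size 1 gives TOY_SEND_FAILED.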
I found various things that seem counterintuitive to me and will summarize
all this tomorrow.