Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI devel] RFC: about dynamic/intercomm_create test from ibm test suite
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2014-05-27 20:33:45

Note that MPI says that COMM_DISCONNECT simply disconnects that individual communicator. It does *not* guarantee that the processes involved will be fully disconnected.

So I think that the freeing of communicators is good app behavior, but it is not required by the MPI spec.

If OMPI is requiring this for correct termination, then something is wrong. MPI_FINALIZE is supposed to be collective across all connected MPI procs -- and if the parent and spawned procs in this test are still connected (because they have not disconnected all communicators between them), the FINALIZE is supposed to be collective across all of them.

This means that FINALIZE is allowed to block if it needs to, so OMPI sending control messages to procs that are still "connected" (in the MPI sense) should never cause a race condition.

As such, this sounds like an OMPI bug.
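
To make the distinction between disconnecting a communicator and disconnecting processes concrete, here is a minimal toy model in plain C. It is only a sketch and does not use MPI or any OMPI internals: the names `world_t`, `comm_create`, `comm_disconnect`, and `is_connected` are hypothetical, and each communicator is simplified to a link between exactly two processes. Two processes count as "connected" as long as some chain of remaining communicators links them.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PROCS 8

/* Toy model (not MPI): links[a][b] counts the communicators
 * that currently contain both process a and process b. */
typedef struct {
    int nprocs;
    int links[MAX_PROCS][MAX_PROCS];
} world_t;

/* Start with nprocs processes and no communicators between them. */
void world_init(world_t *w, int nprocs)
{
    assert(nprocs > 0 && nprocs <= MAX_PROCS);
    w->nprocs = nprocs;
    for (int i = 0; i < MAX_PROCS; i++)
        for (int j = 0; j < MAX_PROCS; j++)
            w->links[i][j] = 0;
}

/* Record one more communicator spanning processes a and b. */
void comm_create(world_t *w, int a, int b)
{
    w->links[a][b]++;
    w->links[b][a]++;
}

/* Model disconnecting ONE communicator between a and b.
 * Any other communicators between them are left untouched. */
void comm_disconnect(world_t *w, int a, int b)
{
    assert(w->links[a][b] > 0);
    w->links[a][b]--;
    w->links[b][a]--;
}

/* Two processes are "connected" if a path of remaining
 * communicators links them (depth-first search). */
bool is_connected(const world_t *w, int a, int b)
{
    bool seen[MAX_PROCS] = { false };
    int stack[MAX_PROCS];   /* each process is pushed at most once */
    int top = 0;

    stack[top++] = a;
    seen[a] = true;
    while (top > 0) {
        int cur = stack[--top];
        if (cur == b)
            return true;
        for (int n = 0; n < w->nprocs; n++) {
            if (w->links[cur][n] > 0 && !seen[n]) {
                seen[n] = true;
                stack[top++] = n;
            }
        }
    }
    return false;
}
```

In this model, if a parent (process 0) and a child (process 1) share two communicators and only one is disconnected, `is_connected(&w, 0, 1)` still returns true. They become disconnected only once the last shared communicator is gone, which is the situation in which FINALIZE would no longer have to be collective across both.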

On May 27, 2014, at 2:27 AM, Gilles Gouaillardet <gilles.gouaillardet_at_[hidden]> wrote:

> Folks,
> Currently, the dynamic/intercomm_create test from the ibm test suite outputs the following message:
> dpm_base_disconnect_init: error -12 in isend to process 1
> The root cause is that task 0 tries to send messages to tasks that have already exited.
> One way of seeing things is that this is an application issue:
> task 0 should have MPI_Comm_free'd all its communicators before calling MPI_Comm_disconnect.
> This can be achieved via the attached patch.
> Another way of seeing things is that this is a bug in OpenMPI.
> In this case, what would be the right approach?
> - automatically free communicators (if needed) when MPI_Comm_disconnect is invoked?
> - simply remove communicators (if needed) from ompi_mpi_communicators when MPI_Comm_disconnect is invoked?
> /* this causes a memory leak, but the application can be seen as responsible for it */
> - other ?
> Thanks in advance for your feedback,
> Gilles
> <intercomm_create.patch>

Jeff Squyres