Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] hcoll destruction via MPI attribute
From: George Bosilca (bosilca_at_[hidden])
Date: 2014-01-10 10:04:03


On Jan 10, 2014, at 15:55 , Jeff Squyres (jsquyres) <jsquyres_at_[hidden]> wrote:

> On Jan 10, 2014, at 9:49 AM, George Bosilca <bosilca_at_[hidden]> wrote:
>
>> As I said, this is the case today. There are ongoing discussions in the MPI Forum to relax the wording of MPI_Comm_free, as most MPI implementations do not rely on its strict “collective” behavior (in the sense that it has to be called by all processes, but not necessarily at the same time).
>
> That will be an interesting discussion. I look forward to your proposal. :-)

? We already had this discussion in the context of another proposal. Anyway that’s an MPI Forum issue.

>>> I still agree with this point, though — even though COMM_FREE is collective, you could still get into ordering / deadlock issues if you're (effectively) doing communication inside it.
>>
>> As long as the call is collective and the same attributes exist on all communicators, I don’t see how a deadlock is possible. My wording was more a precaution for the future than a restriction for today.
> Here's an example:
>
> -----
> MPI_Comm comm;
> // comm is set up as an hcoll-enabled communicator
> if (rank == x) {
>     MPI_Send(..., y, tag, MPI_COMM_WORLD);
>     MPI_Comm_free(&comm);
> } else if (rank == y) {
>     MPI_Comm_free(&comm);
>     MPI_Recv(..., x, tag, MPI_COMM_WORLD);
> }
> ------
>
> If the hcoll teardown in the COMM_FREE blocks waiting for all of its peer COMM_FREEs in other processes in the communicator (e.g., due to blocking communication), rank x may block in MPI_SEND waiting for rank y’s MPI_RECV, and therefore never invoke its COMM_FREE.

Based on today’s MPI standard, this code is incorrect: MPI_Comm_free is collective, and you can’t have matching blocking communications crossing a collective line.
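
To make the ordering argument concrete, here is a rough toy model in plain C. It is only a sketch, not MPI or hcoll code: op_t and runs_to_completion are hypothetical names, and it assumes the worst case discussed above, namely that the send is synchronous (no eager buffering) and that the free blocks until the peer reaches its own free.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not part of MPI or hcoll. */
typedef enum { OP_SEND, OP_RECV, OP_FREE } op_t;

/*
 * Two ranks each execute a fixed sequence of blocking operations.
 * Assumptions of this sketch:
 *   - OP_SEND completes only once the peer is in its matching OP_RECV;
 *   - OP_FREE completes only once the peer is also in OP_FREE.
 * Returns 1 if both ranks finish, 0 if they end up stuck.
 */
int runs_to_completion(const op_t *a, size_t na, const op_t *b, size_t nb)
{
    size_t ia = 0, ib = 0;

    assert(a != NULL && b != NULL);

    while (ia < na && ib < nb) {
        op_t x = a[ia], y = b[ib];
        int match = (x == OP_SEND && y == OP_RECV) ||
                    (x == OP_RECV && y == OP_SEND) ||
                    (x == OP_FREE && y == OP_FREE);
        if (!match) {
            return 0;   /* each rank waits for something its peer never reaches */
        }
        ia++;           /* both blocking operations complete together */
        ib++;
    }
    /* A rank with operations left while its peer has finished blocks forever. */
    return ia == na && ib == nb;
}
```

Under these assumptions, the quoted sequence (rank x: SEND then FREE; rank y: FREE then RECV) gets stuck at the first step, while putting the RECV before the FREE on rank y lets both ranks finish. With an eager (buffered) send the quoted sequence could complete in practice, which is why the deadlock is described only as something that may happen.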

  George.

>
> --
> Jeff Squyres
> jsquyres_at_[hidden]