Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Connect/Accept and Disconnect
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-12-21 09:12:35


Are you using ompi-server for pub/sub, or just letting it default to mpirun?

You might want to print the port name that lookup_name returns and the one you passed to publish_name to see if they match. If they differ, then you will definitely hang.
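One way to read the suggestion above is to log the port string on both sides and compare them. The sketch below is only an illustration under that reading: `port_names_match` is a hypothetical helper, not part of MPI or Open MPI, and the service name `"my_service"` in the comment is made up. The helper itself is plain C so it can be tried without an MPI installation; the comment shows where the two strings would come from in processes A and B.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical debugging helper (NOT an MPI or Open MPI function).
 * Compares the port name that was published with the one that was
 * looked up. Returns 1 if the strings are identical, 0 otherwise.
 * If `out` is non-NULL, both strings and the verdict are written to it.
 */
int port_names_match(const char *published, const char *looked_up, FILE *out)
{
    assert(published != NULL && looked_up != NULL);

    int same = (strcmp(published, looked_up) == 0);

    if (out != NULL) {
        fprintf(out, "published : \"%s\"\n", published);
        fprintf(out, "looked up : \"%s\"\n", looked_up);
        fprintf(out, "result    : %s\n", same ? "match" : "MISMATCH");
    }
    return same;
}

/*
 * Where the two strings would come from (sketch only, needs <mpi.h>,
 * not compiled as part of this example; "my_service" is an assumed name):
 *
 *   Process A:
 *     char port[MPI_MAX_PORT_NAME];
 *     MPI_Open_port(MPI_INFO_NULL, port);
 *     MPI_Publish_name("my_service", MPI_INFO_NULL, port);
 *     fprintf(stderr, "A published: %s\n", port);
 *
 *   Process B:
 *     char port[MPI_MAX_PORT_NAME];
 *     MPI_Lookup_name("my_service", MPI_INFO_NULL, port);
 *     fprintf(stderr, "B looked up: %s\n", port);
 */
```

Because A and B are separate processes, the simplest check is to compare the two log lines by eye. The helper is only useful if, purely for debugging, A's string is also handed to B through some other channel (for example a command-line argument).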

On Dec 21, 2010, at 6:41 AM, Suraj Prabhakaran wrote:

> Hello,
>
> This is basically a repost of my previous mail regarding problems with connect/accept and disconnect (*this is not related to spawning, parent/child*).
> I *sometimes* find processes blocking indefinitely at Connect/Accept calls or at Disconnect calls. I have an example below.
>
> Process A
> {
> MPI_Open_port(...);
> MPI_Publish_name(...);
> MPI_Comm_accept(... &b_comm); // -----> (1)
> // Do something1
> MPI_Comm_disconnect(&b_comm); // ------> (2)
> // Do something2
>
> }
>
> Process B
> {
> MPI_Lookup_name(...);
> MPI_Comm_connect(... &a_comm); // -----> (1)
> // Do something1
> MPI_Comm_disconnect(&a_comm); // ------> (2)
> // Do something2
> }
>
> In the above scenario, even in the ideal case where A reaches (1) without any problems, B *sometimes* blocks at its (1) indefinitely. The arguments passed to both functions are correct.
> Again, *sometimes* one of them blocks indefinitely at (2) while the other goes on to do something2. This could be an application-level problem only if the process that blocks were always the same one, but it is not: sometimes A blocks while B is busy doing something2, and sometimes A is busy doing its something2 while B blocks.
>
> Is this a known issue? Or am I the only person experiencing this, and it works cleanly for others who frequently use connect/accept/disconnect calls?
>
> Thanks,
> Suraj