Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with MPI_Barrier (Inter-communicator)
From: Rodrigo Oliveira (rsilva.oliveira_at_[hidden])
Date: 2012-03-20 14:40:49


Hi Edgar.

Thanks for the response. The simplified code is attached: the server, the
client, and a .h file containing some constants. I added some print
statements to show the behavior.

Regards

Rodrigo

On Tue, Mar 20, 2012 at 11:47 AM, Edgar Gabriel <gabriel_at_[hidden]> wrote:

> do you have by any chance the actual code or a small reproducer? It might
> be much easier to hunt the problem down...
>
> Thanks
> Edgar
>
> On 3/19/2012 8:12 PM, Rodrigo Oliveira wrote:
> > Hi there.
> >
> > I am facing a very strange problem when using MPI_Barrier over an
> > inter-communicator after some operations I describe below:
> >
> > 1) I start a server calling mpirun.
> > 2) The server spawns 2 copies of a client using MPI_Comm_spawn, creating
> > an inter-communicator between the two groups: the server group with 1
> > process (let's call it A) and the client group with 2 processes (group B).
> > 3) After that, I need to detach one of the processes (rank 0) in group B
> > from the inter-communicator AB. To do that, I perform the following steps:
> >
> > Server side:
> > .....
> > tmp_inter_comm = client_comm.Create ( client_comm.Get_group ( ) );
> > client_comm.Free ( );
> > client_comm = tmp_inter_comm;
> > .....
> > client_comm.Barrier();
> > .....
> >
> > Client side:
> > ....
> > rank = 0;
> > tmp_inter_comm = server_comm.Create ( server_comm.Get_group (
> > ).Excl ( 1, &rank ) );
> > server_comm.Free ( );
> > server_comm = tmp_inter_comm;
> > .....
> > if (server_comm != MPI::COMM_NULL)
> > server_comm.Barrier();
> >
> >
> > The problem: everything works fine until the call to Barrier. At that
> > point, the server exits the barrier, but the remaining client in group B
> > does not. Note that only one process is left in B, because I used Excl
> > to remove one process from the original group.
> >
> > P.S.: This occurs with Open MPI 1.5.4 and the C++ API.
> >
> > I am very concerned about this problem because this solution plays a
> > very important role in my master's thesis.
> >
> > Is this an ompi problem or am I doing something wrong?
> >
> > Thanks in advance
> >
> > Rodrigo Oliveira
> >
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
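
The rank bookkeeping behind the client-side Excl call in the quoted message can be sketched without MPI. The snippet below is only an illustrative model, not Open MPI code and not the attached reproducer: `excl_ranks` and `new_rank_of` are hypothetical helper names, and a return value of -1 stands in for a process that would receive MPI::COMM_NULL. It assumes the usual group-exclusion behavior, where the remaining processes keep their relative order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an MPI call): models excluding ranks from a group
// of `group_size` processes. The result lists the old ranks that remain,
// in their original order.
std::vector<int> excl_ranks(int group_size, const std::vector<int>& excluded) {
    std::vector<int> kept;
    for (int old_rank = 0; old_rank < group_size; ++old_rank) {
        const bool dropped =
            std::find(excluded.begin(), excluded.end(), old_rank) != excluded.end();
        if (!dropped) {
            kept.push_back(old_rank);
        }
    }
    return kept;
}

// Hypothetical helper: the rank a process has in the reduced group, or -1 if
// it was excluded (the stand-in here for ending up with MPI::COMM_NULL).
int new_rank_of(const std::vector<int>& kept, int old_rank) {
    for (std::size_t i = 0; i < kept.size(); ++i) {
        if (kept[i] == old_rank) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

For the two-process group B described above, `excl_ranks(2, {0})` leaves only old rank 1, which becomes rank 0 of the reduced group, while `new_rank_of` returns -1 for old rank 0. Under this model, exactly one client process and the single server process would be expected to take part in the barrier.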