Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with MPI_Barrier (Inter-communicator)
From: Edgar Gabriel (gabriel_at_[hidden])
Date: 2012-04-04 16:04:44


Did you try to start the program with the --mca coll ^inter switch that
I mentioned? Collective dup for intercommunicators should work; it's
probably again the bcast over a communicator of size 1 that is causing
the hang, and you could avoid it with that flag.

Also, if you could attach your test code, that would help in hunting
things down.
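
As a point of reference for the Excl/Create/Dup snippet quoted below, here is a minimal sketch in plain C++ (no MPI calls, so it runs without an MPI installation) that models the rank renumbering a group exclusion is specified to produce: the remaining processes keep their relative order and are renumbered from 0. The helper names ranks_after_excl and expected_dup_callers are hypothetical and exist only for this illustration; they are not part of MPI or Open MPI.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an MPI call): models the renumbering described
// for MPI_Group_excl / MPI::Group::Excl. For each old rank it returns the
// rank in the new group, or -1 if that process was excluded.
// Out-of-range entries in `excluded` are ignored here; real MPI treats
// them as erroneous.
std::vector<int> ranks_after_excl(int group_size, const std::vector<int>& excluded) {
    assert(group_size >= 0);
    std::vector<bool> dropped(static_cast<std::size_t>(group_size), false);
    for (int r : excluded) {
        if (r >= 0 && r < group_size) dropped[static_cast<std::size_t>(r)] = true;
    }
    std::vector<int> new_rank(static_cast<std::size_t>(group_size), -1);
    int next = 0;
    for (std::size_t old = 0; old < new_rank.size(); ++old) {
        if (!dropped[old]) new_rank[old] = next++;  // order is preserved
    }
    return new_rank;
}

// Hypothetical helper: how many processes of the local group are still
// members, and would therefore be expected to take part in a later
// collective such as Dup on the reduced communicator.
int expected_dup_callers(int group_size, const std::vector<int>& excluded) {
    int count = 0;
    for (int r : ranks_after_excl(group_size, excluded)) {
        if (r >= 0) ++count;
    }
    return count;
}
```

With 3 clients and rank 1 excluded, old ranks 0 and 2 become new ranks 0 and 1, so two client processes are expected in the Dup. As far as the MPI standard describes it, Create and Dup on an intercommunicator are collective over both groups, so the server side would presumably need to make matching calls as well; whether that plays a role in the hang reported here is not something this sketch can show.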

Thanks
Edgar

On 4/4/2012 2:18 PM, Thatyene Louise Alves de Souza Ramos wrote:
> Hi there.
>
> I've made some tests related to the problem reported by Rodrigo, and I
> think (though I'd rather be wrong) that collective calls like Create and
> Dup do not work with inter-communicators. I've tried this in the client group:
>
> MPI::Intercomm tmp_inter_comm;
>
> tmp_inter_comm = server_comm.Create(server_comm.Get_group().Excl(1, &rank));
>
> if (server_comm.Get_rank() != rank)
>     server_comm = tmp_inter_comm.Dup();
> else
>     server_comm = MPI::COMM_NULL;
>
> The server_comm is the original inter communicator with the server group.
>
> I've noticed that the program hangs in the Dup call. It seems that the
> tmp_inter_comm created without one process still contains this process,
> because the other processes are waiting for it to call Dup too.
>
> What do you think?
>
> On Wed, Mar 28, 2012 at 6:03 PM, Edgar Gabriel <gabriel_at_[hidden]> wrote:
>
> It just uses a different algorithm which avoids the bcast on a
> communicator of size 1 (which is causing the problem here).
>
> Thanks
> Edgar
>
> On 3/28/2012 12:08 PM, Rodrigo Oliveira wrote:
> > Hi Edgar,
> >
> > I tested the execution of my code using the option -mca coll ^inter as
> > you suggested and the program worked fine, even when I use 1 server
> > instance.
> >
> > What does this parameter change? I did not find an
> > explanation of what the coll inter module does.
> >
> > Thanks a lot for your attention and for the solution.
> >
> > Best regards,
> >
> > Rodrigo Oliveira
> >
> > On Tue, Mar 27, 2012 at 1:10 PM, Rodrigo Oliveira
> > <rsilva.oliveira_at_[hidden]> wrote:
> >
> >
> > Hi Edgar.
> >
> > Thanks for the response. I just did not understand why the Barrier
> > works before I remove one of the client processes.
> >
> > I tried it with 1 server and 3 clients and it worked properly. After
> > I removed 1 of the clients, it stopped working. So, the removal is
> > affecting the functionality of Barrier, I guess.
> >
> > Does anyone have an idea?
> >
> >
> > On Mon, Mar 26, 2012 at 12:34 PM, Edgar Gabriel
> > <gabriel_at_[hidden]> wrote:
> >
> > I do not recall what the agreement was on how to treat the size=1
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Edgar Gabriel
Associate Professor
Parallel Software Technologies Lab      http://pstl.cs.uh.edu
Department of Computer Science          University of Houston
Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335