
Subject: Re: [OMPI users] problems with establishing an intercommunicator
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-03-09 10:44:19


The MPI_Comm_connect and MPI_Comm_accept calls are collective over their entire communicators.

So if you pass MPI_COMM_WORLD into MPI_Comm_connect/accept, then *all* processes in those respective MPI_COMM_WORLDs need to call MPI_Comm_connect/accept.
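For example, a minimal sketch of the accept side, using the variable names from your fragments below (the connect side is symmetric, with MPI_Comm_connect):

    char port[MPI_MAX_PORT_NAME];
    if (rank == 0) {
        MPI_Open_port(MPI_INFO_NULL, port);
        /* ... publish the port name, e.g. over your named pipe ... */
    }
    /* Collective over 'this': every rank makes this call; the port
       name is only significant at the root (rank 0 here). */
    MPI_Comm_accept(port, MPI_INFO_NULL, 0, this, &that);
    if (rank == 0)
        MPI_Close_port(port);

Only the rank-0-specific work (opening and publishing the port) stays inside the "if (rank == 0)" block; the accept itself moves outside it.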

For your second question: once you get this to work, all processes can send directly to each other -- Open MPI doesn't currently have any "routing" capabilities (e.g., sending through an intermediate process to reach a third process).
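For example, a sketch using the 'that' intercommunicator from your fragments: point-to-point operations on an intercommunicator address ranks in the *remote* group, so any process on one side can reach any process on the other directly:

    int value = 42;
    /* client side, local rank 1: send to rank 0 of the server group */
    MPI_Send(&value, 1, MPI_INT, 0, 0, that);

    /* server side, rank 0: receive from rank 1 of the client group */
    MPI_Recv(&value, 1, MPI_INT, 1, 0, that, MPI_STATUS_IGNORE);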

On Mar 8, 2011, at 9:40 PM, Waclaw Kusnierczyk wrote:

> Hello,
>
> I'm trying to connect two independent MPI process groups with an intercommunicator, using ports, as described in sec. 10.4 of the MPI standard. One group runs a server, the other a client. The server opens a port, publishes the port's name, and waits for a connection. The client obtains the port's name and connects to it. The problem is that the code works only when the server and the client each run as a one-process MPI group. If either MPI group has more than one process, the program hangs.
>
> The following are two fragments of a minimal code example reproducing the problem on my machine. The server:
>
> if (rank == 0) {
>     MPI_Open_port(MPI_INFO_NULL, port);
>     int fifo = open(argv[1], O_WRONLY);
>     write(fifo, port, MPI_MAX_PORT_NAME);
>     close(fifo);
>     printf("[server] listening on port '%s'\n", port);
>     MPI_Comm_accept(port, MPI_INFO_NULL, 0, this, &that);
>     printf("[server] connected\n");
>     MPI_Close_port(port);
> }
> MPI_Barrier(this);
>
> and the client:
>
> if (rank == 0) {
>     int fifo = open(buffer, O_RDONLY);
>     read(fifo, port, MPI_MAX_PORT_NAME);
>     close(fifo);
>     printf("[client] connecting to port '%s'\n", port);
>     MPI_Comm_connect(port, MPI_INFO_NULL, 0, this, &that);
>     printf("[client] connected\n");
> }
> MPI_Barrier(this);
>
> where 'this' is the local MPI_COMM_WORLD, and the port name is transmitted via a named pipe. (The complete code, together with a makefile, is attached for reference.)
>
> When the compiled programs are run with one MPI process each:
>
> mkfifo port
> mpirun -np 1 ./server port &
> mpirun -np 1 ./client port
>
> the connection is established as expected. With more than one process on either side, however, the execution blocks at the connect-accept step (i.e., after the 'listening' and 'connecting' messages are printed, but before the 'connected' messages are); using the attached code,
>
> make NS=2 run
>
> or
>
> make NC=2 run
>
> should reproduce the problem.
>
> I'm using Open MPI on two different machines: 1.4 on a 2-core laptop and 1.3.3 on a large supercomputer, and I see the same problem on both. Where am I going wrong?
>
> One more, related question: once I manage to establish an intercommunicator for two multi-process MPI groups, can any process in one group send a message to any process in the other, directly, or does the communication have to go through the root nodes?
>
> Regards,
> Wacek
>
> <rendezvous.tgz>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/