Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How to specify hosts for MPI_Comm_spawn
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-07-30 11:49:21

On Jul 30, 2008, at 11:12 AM, Mark Borgerding wrote:

> I appreciate the suggestion about running a daemon on each of the
> remote nodes, but wouldn't I kind of be reinventing the wheel there?
> Process management is one of the things I'd like to be able to count
> on ORTE for.

Keep in mind that the daemons here are not for process management --
they're for name service.

> Would the following work to give the parent process an intercomm
> with each child?
> 1. The parent (i.e., my non-mpirun-started process) calls MPI_Init,
>    then MPI_Open_port.
> 2. The parent spawns an mpirun command via system/exec to create the
>    remote children. The name from MPI_Open_port is placed in the
>    environment.
> 3. The parent calls MPI_Comm_accept (once for each child?).
> 4. All children call MPI_Comm_connect to the name.

It may be problematic to call system/exec in some environments (e.g.,
if using OpenFabrics networks, where fork() after MPI_Init does not mix
well with registered memory). Bad Things can happen.

> I think this would give one intercommunicator back to the parent for
> each remote process (not ideal, but I can worry about broadcast data
> later)
> The remote processes can communicate to each other through
> Actually when I think through the details, much of this is pretty
> similar to the daemon MPI_Publish_name+MPI_Lookup_name approach.
> The main difference being which processes come first.

Instead of having the framework call MPI_Init in your plugin, can your
plugin system/exec "mpirun -np 1 my_parent_app"? And perhaps use a
pipe (or socket or some other IPC) to communicate between the
framework process and my_parent_app? I realize it's a kludgey
workaround, but it looks like we clearly have a bug in the 1.2 series
with singletons in this area...
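
To make the workaround a bit more concrete, here is a minimal sketch of
the pipe variant, using only POSIX calls. The names helper_t,
helper_start, and helper_stop are hypothetical, made up for this
illustration. It also assumes that mpirun forwards its standard input
and output to my_parent_app; if that does not hold in your setup, a
named pipe or a socket would be the fallback. Note that the framework
process never calls MPI_Init here, so the fork happens in a process
with no MPI state.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handle for a command launched by the framework process. */
typedef struct {
    pid_t pid;     /* pid of the launched command (e.g. mpirun)   */
    int   to_fd;   /* framework writes here -> command's stdin    */
    int   from_fd; /* framework reads here  <- command's stdout   */
} helper_t;

/* Launch argv[0] with stdin/stdout wired to two pipes. Returns 0 on success. */
int helper_start(helper_t *h, char *const argv[])
{
    int to_child[2], from_child[2];

    assert(h != NULL && argv != NULL && argv[0] != NULL);

    if (pipe(to_child) != 0)
        return -1;
    if (pipe(from_child) != 0) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(to_child[0]);   close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: connect the pipe ends to stdin/stdout, then exec. */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[0]);   close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        execvp(argv[0], argv);
        _exit(127);           /* only reached if exec failed */
    }

    /* Parent: keep only the ends it uses. */
    close(to_child[0]);
    close(from_child[1]);
    h->pid     = pid;
    h->to_fd   = to_child[1];
    h->from_fd = from_child[0];
    return 0;
}

/* Close both pipes and reap the command. Returns its exit status, or -1. */
int helper_stop(helper_t *h)
{
    int status = 0;

    close(h->to_fd);          /* the command sees EOF on stdin */
    close(h->from_fd);
    if (waitpid(h->pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The plugin would call helper_start with an argv of
{"mpirun", "-np", "1", "my_parent_app", NULL}, then write() requests to
to_fd and read() replies from from_fd; the message format between the
two sides is entirely up to you.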

Jeff Squyres
Cisco Systems