Open MPI Development Mailing List Archives

From: Ralph Castain (rhc_at_[hidden])
Date: 2006-05-31 14:29:02


Hi Pak

I'm afraid I don't fully understand your question, so forgive me if I
don't seem to address the problem adequately. As I understand it, you
are asking about the scenario where someone wants to execute multiple
calls of mpirun, with the applications executing on the same set of
nodes. Your question is: why does OpenRTE spawn a new daemon (orted) on
the node each time we execute mpirun - why doesn't it just use the
existing one to launch the new application process(es)?

Assuming I have the question right, the short answers are "may not be
permitted" and "not yet implemented". :-)

First, the fact that an orted already exists on a node is not sufficient
to allow us to use it again for another application. The orted must be
persistent or else we do not allow a new application to re-use it. This
is required because the existing orted will go away when its original
application is done executing - if we use it as our parent to launch
another child, then the new application process will "die" when the
original one completes. Obviously, that isn't desirable.

Second, even though you can launch persistent orteds today, none of the
current components in the resource management subsystems actually know
how to use them yet. This is something we plan to implement, but there
simply hasn't been time to do so yet.

So the bottom line is that, for now, there is no way around the need to
launch a new orted on each node every time the user issues an mpirun
command.
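
To make the two conditions above concrete, here is a minimal sketch of
the decision in Python. It is a toy model, not OpenRTE code: the Daemon
class, the persistent flag, and launcher_supports_reuse are hypothetical
names invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Daemon:
    """Toy stand-in for an orted on a node (hypothetical, not an OpenRTE type)."""
    node: str
    persistent: bool  # assumed flag: True if the daemon outlives its first job


def must_launch_new_daemon(existing: Optional[Daemon],
                           launcher_supports_reuse: bool) -> bool:
    """Return True when a fresh daemon has to be started on the node."""
    if existing is None:
        # No daemon on the node at all.
        return True
    if not existing.persistent:
        # A non-persistent daemon exits with its original application,
        # which would take any newly launched children down with it.
        return True
    # Even a persistent daemon is only usable if the launcher knows how
    # to hand work to it ("not yet implemented" in the message above).
    return not launcher_supports_reuse


if __name__ == "__main__":
    d = Daemon(node="node0", persistent=True)
    print(must_launch_new_daemon(d, launcher_supports_reuse=False))
```

With launcher_supports_reuse fixed at False, which is the situation
described above, the function returns True for every input.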

I hope that answers your question. If not, please don't hesitate to let
me know.
Ralph

Pak Lui wrote:
> Hi,
>
> When I run a spawn program over rsh/ssh, I notice that each time a
> child program gets spawned, a new rsh/ssh connection must be
> established to the remote node to launch orted there, even though the
> parent executable and its orted are already running on that node.
>
> So I wonder if there is any way that we can use the parent orted to
> launch the child program if they happen to be on the same node?
>
> I tried comparing the spawn program to the scenario where I run
> multiple executables in one mpirun command. For that run, only one
> connection to the remote node is established, and both executables
> share it.
>
> % ./mpirun -np 2 -host burl-ct-v440-5 -prefix `pwd`/.. sleep 12 : -np 2 sleep 10
> Password:
>
> 15015 /workspace/paklui/ompi/trunk/builds/sparc32-g/bin/../bin/orted --bootprox
> 15017 sleep 12
> 15019 sleep 12
> 15021 sleep 10
> 15023 sleep 10
>
> The reason I want to find out whether orted can launch child
> executable(s) without establishing a new connection is that the number
> of times I can run 'qrsh' in SGE (or N1GE) depends on the number of
> slots that the user initially allocated. The slot count corresponds to
> the number of CPUs on a node, and each slot allows one 'qrsh'
> connection.
>
> The issue arises when I try to run a spawn job on a single node, or on
> a cluster of many 1-cpu nodes, under SGE. The number of times that the
> program can spawn is limited by 'qrsh': it refuses to let the child
> program connect to a node where the parent executable's orted may
> already be running.
>
> I am curious to see if I can find some solution to this problem. I
> am also looking into whether there are some tricks in SGE to get around
> this issue, but the workarounds I can see aren't pretty. So I welcome
> your questions, comments or suggestions on this.
>
>
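
The slot limit described in the quoted message can be illustrated with
a small counting model in Python. This is only a sketch under stated
assumptions: each new orted launch on a node consumes one 'qrsh'
connection, connections are held until the job ends, and the parent's
orted already occupies one of them. The function name and parameters
are hypothetical and do not correspond to any SGE or Open MPI API.

```python
def successful_spawns(slots: int, spawns: int, reuse_daemon: bool) -> int:
    """Count how many spawn requests succeed on a single node.

    slots        -- 'qrsh' connections the node allows (one per slot)
    spawns       -- number of spawn requests the parent issues
    reuse_daemon -- hypothetical mode where the existing orted launches
                    the children, so no new connection is needed
    """
    used = 1  # assumption: the parent's orted already holds one connection
    succeeded = 0
    for _ in range(spawns):
        if reuse_daemon:
            # No new connection required; the spawn always goes through.
            succeeded += 1
        elif used < slots:
            # A free slot remains, so a new orted can be started.
            used += 1
            succeeded += 1
        # Otherwise the connection is refused and the spawn fails.
    return succeeded


if __name__ == "__main__":
    for slots in (1, 4):
        print(slots, successful_spawns(slots, spawns=3, reuse_daemon=False))
```

Under these assumptions a 1-cpu node cannot host any spawned child
unless the existing daemon is reused, which is the case the quoted
message is concerned with.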