Open MPI Development Mailing List Archives

From: Ralph Castain (rhc_at_[hidden])
Date: 2007-07-27 10:13:57

On 7/27/07 7:58 AM, "Terry D. Dontje" <Terry.Dontje_at_[hidden]> wrote:

> Ralph Castain wrote:
>> WHAT: Proposal to add two new command line options that will allow us to
>> replace the current need to separately launch a persistent daemon to
>> support connect/accept operations
>> WHY: Remove problems of confusing multiple allocations, provide a cleaner
>> method for connect/accept between jobs
>> WHERE: minor changes in orterun and orted, some code in rmgr and each pls
>> to ensure the proper jobid and connect info is passed to each
>> app_context as it is launched
> It is my opinion that we would be better off attacking the issues of
> the persistent daemons described below than creating a new set of
> options to mpirun for process placement. (more comments below on
> the actual proposal).

Non-trivial problems - we haven't figured them out in three years of
occasional effort. It isn't clear that they even -can- be solved when
considering the problem of running in multiple RM-based allocations.

I'll try to provide more detail on the problems when I return from my quick

>> TIMEOUT: 8/10/07
>>
>> We currently do not support connect/accept operations in a clean way. Users
>> are required to first start a persistent daemon that operates in a
>> user-named universe. They then must enter the mpirun command for each
>> application in a separate window, providing the universe name on each
>> command line. This is required because (a) mpirun will not run in the
>> background (in fact, at one point in time it would segfault, though I
>> believe it now just hangs), and (b) we require that all applications using
>> connect/accept operate under the same HNP.
>>
>> This is burdensome and appears to be causing problems for users as it
>> requires them to remember to launch that persistent daemon first -
>> otherwise, the applications execute, but never connect. Additionally, we
>> have the problem of confused allocations from the different login sessions.
>> This has caused numerous problems of processes going to incorrect locations,
>> allocations timing out at different times and causing jobs to abort, etc.
>>
>> What I propose here is to eliminate the confusion in a manner that minimizes
>> code complexity. The idea is to utilize our so-painfully-developed multiple
>> app_context capability to have the user launch all the interacting
>> applications with the same mpirun command. This not only eliminates the
>> annoyance factor for users by eliminating the need for multiple steps and
>> login sessions, but also solves the problem of ensuring that all
>> applications are running in the same allocation (so we don't have to worry
>> any more about timeouts in one allocation aborting another job).
>>
>> The proposal is to add two command line options that are associated with a
>> specific app_context (feel free to redefine the name of the option - I don't
>> personally care):
>>
>> 1. --independent-job - indicates that this app_context is to be launched as
>> an independent job. We will assign it a separate jobid, though we will map
>> it as part of the overall command (e.g., if by slot and no other directives
>> provided, it will start mapping where the prior app_context left off)
> I am unclear on what the --connect option really does. The MPI codes
> actually have to call MPI_Comm_connect to really connect to a process.
> Can we get away with just the above option?

You are right - connect doesn't need to exist. I was thinking it would just
minimize the startup message as I wouldn't bother sharing RTE info across
jobs that weren't "connected". However, for MPI users, this probably would
be confusing, so I would suggest just dropping it. With the routed rml, it
won't have that much impact anyway (I think).
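
As a rough illustration of the --independent-job semantics in item 1 quoted above, here is a minimal Python sketch; it is not ORTE code. It assumes that app_contexts without the flag share one base job, that each flagged app_context receives the next jobid, and that by-slot mapping simply continues where the previous app_context stopped. The names AppContext and map_app_contexts, the jobid numbering, and the slot labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AppContext:
    name: str                  # label for the application (hypothetical)
    nprocs: int                # number of processes requested
    independent: bool = False  # models the proposed --independent-job flag


def map_app_contexts(
    app_contexts: List[AppContext], slots: List[str], base_jobid: int = 1
) -> List[Tuple[int, str, int, str]]:
    """Return (jobid, app name, rank within job, slot) for every process.

    Assumption: unflagged app_contexts belong to the base job, and each
    flagged app_context gets its own new jobid. Slots are consumed in order
    across all app_contexts, so each one starts where the prior one left off.
    """
    needed = sum(app.nprocs for app in app_contexts)
    if needed > len(slots):
        # The whole allocation for the combined app_contexts must be provided.
        raise ValueError("allocation too small for the combined app_contexts")

    placements = []
    next_jobid = base_jobid + 1
    next_rank = {base_jobid: 0}  # per-job rank counters
    cursor = 0                   # index of the next free slot

    for app in app_contexts:
        if app.independent:
            jobid = next_jobid
            next_jobid += 1
            next_rank[jobid] = 0
        else:
            jobid = base_jobid
        for _ in range(app.nprocs):
            placements.append((jobid, app.name, next_rank[jobid], slots[cursor]))
            next_rank[jobid] += 1
            cursor += 1
    return placements


if __name__ == "__main__":
    apps = [AppContext("server", 2), AppContext("client", 2, independent=True)]
    slots = ["node0", "node0", "node1", "node1"]
    for placement in map_app_contexts(apps, slots):
        print(placement)
```

In this sketch the second app_context gets its own jobid and its ranks restart at 0, while its processes land on the slots immediately following those used by the first app_context.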

>> 2. --connect x,y,z - only valid when combined with the above option,
>> indicates that this independent job is to be MPI-connected to app_contexts
>> x,y,z (where x,y,z are the number of the app_context, counting from the
>> beginning of the command - you choose if we start from 0 or 1).
>> Alternatively, we can default to connecting to everyone, and then use
>> --disconnect to indicate we -don't- want to be connected.
>>
>> Note that this means the entire allocation for the combined app_contexts
>> must be provided. This helps the RTE tremendously to keep things straight,
>> and ensures that all the app_contexts will be able to complete (or not) in a
>> synchronized fashion.
>>
>> It also allows us to eliminate the persistent daemon and multiple login
>> session requirements for connect/accept. That does not mean we cannot have a
>> persistent daemon to create a virtual machine, assuming we someday want to
>> support that mode of operation. This simply removes the requirement that the
>> user start one just so they can use connect/accept.
>>
>> Comments?
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]