
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] OMPI/ORTE and tools
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-01-16 11:25:22

Hi Josh

I already converted orte-ps and orte-clean in the tmp/rhc-step2b branch on
OMPI's svn repository. Shouldn't be hard to convert the checkpoint/restart
tools to use it too - I may have already done some of that work, but I may
not be remembering it correctly.

I'll do some cleanup on the code in my private repository and put the rest
of the implementation in the svn repository next week. I mostly just needed
to talk to Jeff this morning about setting up the comm library - he pointed
out that if I create a special "orte_tool_init" function that only calls
what is needed, then the linker won't bring everything else into the
executable, so a separate "library" may not be required. Still needs to be
tested to ensure I can make that work as neatly as desired.

Appreciate the feedback

On 1/16/08 8:58 AM, "Josh Hursey" <jjhursey_at_[hidden]> wrote:

> Ralph,
> This looks interesting. Can you point me to the header files and any
> ORTE tools that you may have converted to use this library already
> (e.g., orte-ps)? I can port the checkpoint/restart tools to this
> library and start sending you some feedback on the API.
> Cheers,
> Josh
> On Jan 16, 2008, at 7:47 AM, Ralph Castain wrote:
>> Hello all
>>
>> Summary: this note provides a brief overview of how various tools can
>> interface to OMPI applications once the next version of ORTE is
>> integrated into the trunk. It includes a request for input regarding
>> any needs (e.g., additional commands to be supported in the interface)
>> that have not been adequately addressed.
>>
>> As many of you know, I have been working on a tmp branch to complete
>> the revamp of ORTE that has been in progress for quite some time.
>> Among other things, this revamp is intended to simplify the system,
>> provide enhanced scalability, and improve reliability.
>> As part of that effort, I have extensively revised the support for
>> external tools. In the past, tools such as the Eclipse PTP could only
>> interact with Open MPI-based applications via ORTE APIs, thus exposing
>> the tool to any changes in those APIs. Most tools, however, do not
>> require the level of control provided by the APIs and can benefit from
>> a simplified interface.
>>
>> Accordingly, the revamped ORTE now offers alternative methods of
>> interaction. The primary change has been the creation of a
>> communications library with a simple serial protocol for interacting
>> with OMPI jobs. Thus, tools now have three choices for interacting
>> with OMPI jobs:
>> 1. I have created a new communications library that tools can link
>> against. It does not include all of the ORTE or OMPI libraries, so it
>> has a very small memory footprint. Besides the usual calls to
>> initialize and finalize, the library contains utilities for finding
>> all of the OMPI jobs running on that HNP (i.e., all OMPI jobs whose
>> mpirun was executed from that host); querying the status of a job
>> (provides the job map plus all proc states); querying the status of
>> nodes (provides node names, status, and the list of procs on each
>> node, including their state); querying the status of a specific
>> process; spawning a new job; and terminating a job. In addition, you
>> can attach to the output streams of any process, specifying stdout,
>> stderr, or both - this "tees" the specified streams, so it won't
>> interfere with the job's normal output flow.
>>
>> I could also create a utility to allow attachment to the input stream
>> of a process. However, I'm a little concerned about possible conflicts
>> with whatever is already flowing across that stream. I would
>> appreciate any suggestions as to whether or not to provide that
>> capability.
>>
>> Note: we removed the concept of the ORTE "universe", so a tool can now
>> talk to any mpirun without complications. Thus, tools can
>> simultaneously "connect" to and monitor multiple mpiruns, if desired.
>> 2. Link against all of OMPI or ORTE, and execute as a standalone
>> program. In this mode, your tool would act as a surrogate for mpirun
>> by directly spawning the user's application. This provides some
>> flexibility, but it does mean that both the tool and the job -must-
>> end together, and that the tool may need to be revised whenever
>> OMPI/ORTE APIs are updated.
>> 3. Link against all of OMPI or ORTE, executing as a distributed set of
>> processes. In this mode, you would execute your tool via "mpirun
>> -pernode ./my_tool" (or whatever command is appropriate - this example
>> would launch one tool process on every node in the allocation). If the
>> tool processes need to communicate with each other, they can call
>> MPI_Init or orte_init, depending upon the level of desired
>> communication. Note that the tool job will be completely standalone
>> from the application job and must be terminated separately.
>> In all of these cases, it is possible for tool processes to connect
>> (via MPI and/or ORTE-RML) to a job's processes, provided that the
>> application supports it.
>> I can provide more details, of course, to anyone wishing them. What I
>> would appreciate, though, is any feedback about desired commands,
>> modes of operation, etc. that I might have missed or that people would
>> prefer be changed.
>>
>> This code is all in a private repository for my tmp branch, but I
>> expect that to merge with the trunk fairly soon. I have provided a
>> couple of example tools to illustrate the above modes of operation in
>> that code.
>> Thanks
>> Ralph
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]