Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] OMPI/ORTE and tools
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-01-16 11:31:59

I should have also stated that one other difference in the "tool" setup is
that the tool does not create a session directory (you can override that if
you need one). This simplifies cleanup of the tool, and helps with tools
like orte-clean. The session directory's main function is to house the shared
memory backing files and other temporary files, which tools typically don't
use because they are MPI-specific.

So the default behavior is to -not- create the session dir - like I said,
though, you can override that if you need it.


On 1/16/08 9:25 AM, "Ralph Castain" <rhc_at_[hidden]> wrote:

> Hi Josh
>
> I already converted orte-ps and orte-clean in the tmp/rhc-step2b branch on
> OMPI's svn repository. Shouldn't be hard to convert the checkpoint/restart
> tools to use it too - I may have already done some of that work, but I may
> not be remembering it correctly.
>
> I'll do some cleanup on the code in my private repository and put the rest
> of the implementation in the svn repository next week. I mostly just needed
> to talk to Jeff this morning about setting up the comm library - he pointed
> out that if I create a special "orte_tool_init" function that only calls
> what is needed, then the linker won't bring everything else into the
> executable, so a separate "library" may not be required. Still needs to be
> tested to ensure I can make that work as neatly as desired.
>
> Appreciate the feedback
> Ralph
>
> On 1/16/08 8:58 AM, "Josh Hursey" <jjhursey_at_[hidden]> wrote:
>
>> Ralph,
>>
>> This looks interesting. Can you point me to the header files and any
>> ORTE tools that you may have converted to use this library already
>> (e.g., orte-ps)? I can port the checkpoint/restart tools to this
>> library and start sending you some feedback on the API.
>>
>> Cheers,
>> Josh
>>
>> On Jan 16, 2008, at 7:47 AM, Ralph Castain wrote:
>>
>>> Hello all
>>>
>>> Summary: this note provides a brief overview of how various tools can
>>> interface to OMPI applications once the next version of ORTE is integrated
>>> into the trunk. It includes a request for input regarding any needs (e.g.,
>>> additional commands to be supported in the interface) that have not been
>>> adequately addressed.
>>>
>>> As many of you know, I have been working on a tmp branch to complete the
>>> revamp of ORTE that has been in progress for quite some time. Among other
>>> things, this revamp is intended to simplify the system, provide enhanced
>>> scalability, and improve reliability.
>>>
>>> As part of that effort, I have extensively revised the support for external
>>> tools. In the past, tools such as the Eclipse PTP could only interact with
>>> Open MPI-based applications via ORTE APIs, thus exposing the tool to any
>>> changes in those APIs. Most tools, however, do not require the level of
>>> control provided by the APIs and can benefit from a simplified interface.
>>>
>>> Accordingly, the revamped ORTE now offers alternative methods of
>>> interaction. The primary change has been the creation of a communications
>>> library with a simple serial protocol for interacting with OMPI jobs. Thus,
>>> tools now have three choices for interacting with OMPI jobs:
>>>
>>> 1. I have created a new communications library that tools can link against.
>>> It does not include all of the ORTE or OMPI libraries, so it has a very
>>> small memory footprint. Besides the usual calls to initialize and finalize,
>>> the library contains utilities for finding all of the OMPI jobs running on
>>> that HNP (i.e., all OMPI jobs whose mpirun was executed from that host);
>>> querying the status of a job (provides the job map plus all proc states);
>>> querying the status of nodes (provides node names, status, and list of
>>> procs on each node including their state); querying the status of a
>>> specific process; spawning a new job; and terminating a job. In addition,
>>> you can attach to the output streams of any process, specifying stdout,
>>> stderr, or both - this "tees" the specified streams, so it won't interfere
>>> with the job's normal output flow.
>>>
>>> I could also create a utility to allow attachment to the input stream of a
>>> process. However, I'm a little concerned about possible conflicts with
>>> whatever is already flowing across that stream. I would appreciate any
>>> suggestions as to whether or not to provide that capability.
>>>
>>> Note: we removed the concept of the ORTE "universe", so a tool can now talk
>>> to any mpirun without complications. Thus, tools can simultaneously
>>> "connect" to and monitor multiple mpiruns, if desired.
>>>
>>> 2. Link against all of OMPI or ORTE, and execute a standalone program. In
>>> this mode, your tool would act as a surrogate for mpirun by directly
>>> spawning the user's application. This provides some flexibility, but it
>>> does mean that both the tool and the job -must- end together, and that the
>>> tool may need to be revised whenever OMPI/ORTE APIs are updated.
>>>
>>> 3. Link against all of OMPI or ORTE, executing as a distributed set of
>>> processes. In this mode, you would execute your tool via "mpirun -pernode
>>> ./my_tool" (or whatever command is appropriate - this example would launch
>>> one tool process on every node in the allocation). If the tool processes
>>> need to communicate with each other, they can call MPI_Init or orte_init,
>>> depending upon the level of desired communication. Note that the tool job
>>> will be completely standalone from the application job and must be
>>> terminated separately.
>>>
>>> In all of these cases, it is possible for tool processes to connect (via MPI
>>> and/or ORTE-RML) to a job's processes provided that the application
>>> supports it.
>>>
>>> I can provide more details, of course, to anyone wishing them. What I would
>>> appreciate, though, is any feedback about desired commands, mode of
>>> operation, etc. that I might have missed or people would prefer be changed.
>>>
>>> This code is all in a private repository for my tmp branch, but I expect
>>> that to merge with the trunk fairly soon. I have provided a couple of
>>> example tools to illustrate the above modes of operation in that code.
>>>
>>> Thanks
>>> Ralph
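
The "tee" behavior described under option 1 for attached output streams can be pictured with a small stand-alone sketch. The Python below is not ORTE code and calls no ORTE or Open MPI API. The name `tee_stream`, its parameters, and the choice to drop a tap that fails are all assumptions made for illustration. It shows only the general idea: each chunk goes to its normal destination first and is then duplicated to any attached tool, so attaching does not disturb the normal flow.

```python
import io


def tee_stream(source, primary, taps, chunk_size=4096):
    """Copy `source` to `primary`, duplicating each chunk to every tap.

    Returns the taps that were still writable when `source` was exhausted.
    """
    live_taps = list(taps)
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        # The normal output path is always served first.
        primary.write(chunk)
        for tap in list(live_taps):
            try:
                tap.write(chunk)
            except (OSError, ValueError):
                # Assumption: a tap that fails is dropped rather than
                # allowed to disturb the primary flow.
                live_taps.remove(tap)
    return live_taps


if __name__ == "__main__":
    job_stdout = io.BytesIO(b"hello from rank 0\n")
    normal_sink = io.BytesIO()  # stands in for the job's usual output
    tool_sink = io.BytesIO()    # stands in for an attached tool
    tee_stream(job_stdout, normal_sink, [tool_sink])
    print(normal_sink.getvalue() == tool_sink.getvalue())
```

Here the in-memory `io.BytesIO` objects stand in for real pipes or sockets. A real implementation would also have to decide what to do about a slow tool, which this sketch ignores.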