Open MPI User's Mailing List Archives

From: Chris Gottbrath (chrisg_at_[hidden])
Date: 2005-10-20 14:00:42


Konstantin, Jeff,

> TotalView interface goes well, except that its details are hardcoded in
> the source of orte/tools/orterun (as you may guess I don't have the
> executable named "totalview", etc.). I'd like to know when and where do
> the functions from orterun/totalview.{h,c} get called, do I need to write

If the name 'totalview' is hardcoded in the startup executable, then
that is something we would be happy to see made more flexible.

What we would like to see is an environment variable (MPICH has
one called 'TOTALVIEW' and I'd love to see that name, but
OPENMPI_DEBUGGER or something else debugger-neutral would
be fine as well) that one can set to the name of the executable
that you would like to have executed. This would help us in our internal
testing and could potentially help customers who might have multiple
versions of TotalView installed.
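
To make the request concrete, here is a minimal C sketch of the kind of
lookup I have in mind. The helper name resolve_debugger is hypothetical,
the variable names are just the two floated above, and the precedence
between them is an assumption; none of this is taken from the orterun
source.

```c
#include <stdlib.h>

/* Hypothetical helper (not part of orterun): choose the debugger
 * executable name. Assumed precedence: OPENMPI_DEBUGGER first, then
 * TOTALVIEW (the name MPICH uses), then the hardcoded default. */
static const char *resolve_debugger(void)
{
    static const char *const vars[] = { "OPENMPI_DEBUGGER", "TOTALVIEW" };
    size_t i;

    for (i = 0; i < sizeof(vars) / sizeof(vars[0]); ++i) {
        const char *value = getenv(vars[i]);
        /* Treat an unset or empty variable as "not specified". */
        if (value != NULL && value[0] != '\0') {
            return value;
        }
    }
    return "totalview";
}
```

With something like this, a full path in the variable would select a
specific installation without touching PATH.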

In our regression testing we want to be able to use the command line
interface version of TotalView. Arguably customers might want to
be able to run TV with some arguments of its own or point to
a specific version without altering their PATH environment variable.
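
If the variable were allowed to carry arguments as well as the
executable name, the launcher would need to split it into an argv
array before exec'ing the debugger. A rough sketch, assuming plain
whitespace splitting with no quoting support; split_debugger_command
is a hypothetical name, not an existing Open MPI function:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split a debugger command string into a
 * NULL-terminated argv array. Splits on spaces and tabs only.
 * The array and the string copy share one allocation, so the
 * caller releases everything with a single free(argv).
 * Returns NULL if the allocation fails. Uses strtok, so it is
 * not reentrant. */
static char **split_debugger_command(const char *command, int *argc_out)
{
    size_t len = strlen(command);
    /* A string of length len holds at most (len + 1) / 2 tokens;
     * len / 2 + 2 slots leaves room for the NULL terminator. */
    size_t max_args = len / 2 + 2;
    char **argv = malloc(max_args * sizeof(*argv) + len + 1);
    char *copy;
    char *token;
    int argc = 0;

    if (argv == NULL) {
        return NULL;
    }
    copy = (char *)(argv + max_args);   /* string lives after the array */
    memcpy(copy, command, len + 1);

    for (token = strtok(copy, " \t"); token != NULL;
         token = strtok(NULL, " \t")) {
        argv[argc++] = token;
    }
    argv[argc] = NULL;

    if (argc_out != NULL) {
        *argc_out = argc;
    }
    return argv;
}
```

The resulting array could be handed to execvp() with argv[0] as the
program name; an empty variable yields argc == 0, which the caller
would have to treat as "use the default".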

Cheers,
Chris

--
Chris Gottbrath
Partner Technologies Engineer    Etnus, LLC
Chris.Gottbrath_at_[hidden]        http://www.etnus.com/
Voice: 508-652-7700 x7735        Fax: 508-652-7787

On Thu, 20 Oct 2005, Konstantin Karganov wrote:
> 
> > However, we're quite open to other approaches.  Because of the nature of
> > our integration with a variety of different run-time environments, our
> > startup is not a shell script -- mpirun ("orterun" is its real name;
> > "mpirun" is a sym link to orterun) is a compiled executable.
> Surely, I saw that mpirun is the orterun executable :)
> And this means that to add some features I need to rebuild it (and some 
> run-time libs probably) each time. 
>  
> > What are the requirements of your debugger?  Do you attempt to launch
> > the MPI processes yourself, or do you attach to them after they are
> > launched (which is what TotalView does)?
> It is supposed to attach GDB to each process after it has launched, so the 
> TotalView interface goes well, except that its details are hardcoded in 
> the source of orte/tools/orterun (as you may guess I don't have the 
> executable named "totalview", etc.). I'd like to know when and where do 
> the functions from orterun/totalview.{h,c} get called, do I need to write 
> my own file like this, etc. In other words, "the debugger adder reference 
> manual" :)
> 
> Currently I launch gdb's on remote processes via ssh (as MPICH does), but 
> probably it will be better to use orte framework capabilities for this. 
> Don't know yet how.
> 
> In general, are there an ompi/orte architecture description docs, other 
> than short schemes in your publications? It's too general there and too 
> detailed in sources and doxygen docs. Some intermediate "how all this 
> works together" doc is needed to assemble the whole picture...
> For me, I do not understand it completely.
> 
> > Open MPI uses orterun as its launcher, not the first MPI process.
> > Hence, it is the one that TotalView gets it information from (in that
> > sense, it's similar to the MPICH model -- there is one coordinator; it's
> > just that it's orterun, not the first MPI process).  Once orterun
> > receives notification that all the MPI processes have started, it gives
> > the nodename/PID information of each process to TotalView who then
> > launches its own debugger processes on those nodes and attaches to the
> > processes.  
> Hm.. with MPICH I use the first gdb copy to get the info from the 0-th 
> process and then continue to use it as a node debugger, here I'll have to 
> use one more gdb to get the process table out of orterun process? And how 
> to do this in a safe way?
> 
> > You probably get a "stopped" message when you try to bg orterun because
> > the shell thinks that it is waiting for input from stdin, because we
> > didn't close it.
> Actually this shouldn't matter. Many programs don't close stdin but 
> nothing prevents them from running in background until they try to 
> read input. The same "Hello world" application runs well with MPICH 
> "mpirun -np 3 a.out &"
>  
> Best regards,
> Konstantin.