Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] debugger confusion
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-11-08 07:52:23


On Nov 7, 2011, at 8:34 PM, Ralph Castain wrote:

> Best guess: from what I've seen, most debuggers don't seem to conform to what the MPI Forum has "accepted". It doesn't appear that the vendors and debugger developers pay too much attention to that document, possibly because it (a) came after the debuggers were developed, and (b) still doesn't seem to be widely adopted.

Keep in mind that the debugger/tool authors essentially wrote the document, with some guidance from the Forum. The Forum saw the wisdom in making it an "official" MPI Forum document so that it would carry some weight, and voted to do so. That document is not actually part of any MPI standard document for multiple reasons; here are two:

1. MPIR has a bunch of known problems which no one is currently interested in fixing (e.g., scalability)
2. No one wanted to *mandate* the MPIR interface in an MPI implementation

It is therefore a standalone document that, since it became an "official" Forum document, is available on mpi-forum.org:

    http://www.mpi-forum.org/docs/mpir-specification-10-11-2010.pdf

To be clear: that document simply standardizes what MPI implementations are supposed to provide in their MPIR implementation (prior to this, MPI implementations tended to have subtle differences between their MPIR implementations, which were a nightmare for the debugger/tool vendors). This document does *not* fix the scalability and other well-known issues with MPIR -- it just consolidates and standardizes the slightly-different versions of MPIR that were floating around out there.
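
For readers who have not looked at MPIR, here is a rough sketch in C of the kind of symbols the starter process (e.g., mpirun) exposes for a tool to find. The MPIR_* names are written from memory of the MPIR document, so check the PDF above for the exact declarations; publish_proctable() and its arguments are hypothetical, made up purely for illustration, and are not part of MPIR or of Open MPI.

```c
#include <assert.h>
#include <stdlib.h>

/* One entry per MPI process in the job. */
typedef struct {
    char *host_name;        /* host the process runs on */
    char *executable_name;  /* path of the executable */
    int   pid;              /* process ID on that host */
} MPIR_PROCDESC;

/* Values the starter stores in MPIR_debug_state. */
#define MPIR_NULL           0
#define MPIR_DEBUG_SPAWNED  1
#define MPIR_DEBUG_ABORTING 2

/* Symbols a tool looks up by name in the starter process. */
MPIR_PROCDESC *MPIR_proctable = NULL;
int MPIR_proctable_size = 0;
volatile int MPIR_being_debugged = 0;      /* written by the tool */
volatile int MPIR_debug_state = MPIR_NULL; /* written by the starter */

/* The tool plants a breakpoint here; the body is intentionally empty. */
void MPIR_Breakpoint(void) {}

/* Hypothetical helper (not part of MPIR): fill in the process table for
 * n processes on one host, then notify the tool via MPIR_Breakpoint(). */
int publish_proctable(int n, char *host, char *exe, const int *pids)
{
    assert(n > 0 && pids != NULL);

    MPIR_PROCDESC *table = malloc((size_t)n * sizeof *table);
    if (table == NULL) {
        return -1;
    }
    for (int i = 0; i < n; i++) {
        table[i].host_name = host;
        table[i].executable_name = exe;
        table[i].pid = pids[i];
    }

    /* Publish the table first, then set the state, then hit the breakpoint. */
    MPIR_proctable = table;
    MPIR_proctable_size = n;
    MPIR_debug_state = MPIR_DEBUG_SPAWNED;
    MPIR_Breakpoint();
    return 0;
}
```

Note that the tool has to read the entire table out of the one starter process, one entry per MPI process, which gives some idea of where the scalability complaints come from.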

> I'd suggest being a little careful about making changes without consulting people who use TV and "stat", at least - those are the ones most recently tested.

Fair enough.

Moving towards what was specified in that document would probably be a good thing, though, since that document *is* the currently accepted version of how MPIR is supposed to work and was essentially written *by* the tool vendors. Of course, appropriate testing with various debuggers and tools out there should be a given -- current versions of DDT, TotalView, and padb are probably the 3 most obvious ones with which to test; others have mentioned "stat," too.

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/