On 15 Mar 2010, at 20:18, Brock Palen wrote:
> Is there a way to view what outstanding messages are in queues for an already running job? I know I can do this via ddt (parallel debugger) but for normal non debugged jobs is there a way to just ask open-mpi "show outstanding messages you have"?
This is one of the things Padb can tell you, along with a lot of other detail about running jobs. The message queue output isn't as concise as it could be at large process counts, but the data is there.
> Thanks, this would be really useful for jobs that only hang randomly or after very long runtimes.
You're right. For example, it's used to good effect in the Open MPI automated testing, as well as at numerous other sites, large and small.
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing