MTT Devel Mailing List Archives

Subject: Re: [MTT devel] Analysis of hung jobs.
From: Ethan Mallove (ethan.mallove_at_[hidden])
Date: 2009-10-07 16:21:39

On Wed, Oct/07/2009 09:04:22PM, Ashley Pittman wrote:
> On Wed, 2009-10-07 at 15:41 -0400, Ethan Mallove wrote:
> > I got the following error doing a simple test:
> As it happens I saw this error earlier on FC8; r279 should fix this
> problem.

Thanks. That eliminates the Perl regex error.

> > $ perl --version
> > This is perl, v5.8.4 built for sun4-solaris-64int
> I had wondered if you'd be using Solaris; this is not something I've
> tested and not something I'd expect to work. The stack trace code
> should all be fine, but there might be some problems reading data
> from /proc. In the past padb has worked on Tru64, so possibly all that
> needs porting is getting the parent pid and process name from ps
> rather than /proc/status.
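
For reference, the ps-based lookup suggested above might look roughly like the sketch below. It is only an illustration, not padb's code (padb is a Perl script): the function names `parse_ps_line` and `parent_and_name` are made up here, and it assumes a `ps` that accepts the POSIX `-o ppid= -o comm= -p PID` options. Whether the output is usable on Solaris has not been checked.

```python
import os
import subprocess


def parse_ps_line(line):
    """Split one line of `ps -o ppid= -o comm=` output into (ppid, name)."""
    # The first whitespace-separated field is the parent pid; everything
    # after it is the command name, which may itself contain spaces.
    ppid, name = line.strip().split(None, 1)
    return int(ppid), name.strip()


def parent_and_name(pid):
    """Ask ps for the parent pid and command name instead of reading /proc."""
    # The trailing '=' on each column name suppresses the header line.
    result = subprocess.run(
        ["ps", "-o", "ppid=", "-o", "comm=", "-p", str(pid)],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_ps_line(result.stdout)


if __name__ == "__main__":
    # Look up this script's own process as a quick sanity check.
    print(parent_and_name(os.getpid()))
```

Keeping the parsing separate from the `ps` call means the parser can be checked without a particular platform's `ps` at hand.
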

Okay. I've moved to Linux for testing:

  $ padb --debug=all --verbose --config-option rmgr=mpirun --full-report=29713
  Loading config from "/etc/padb.conf"
  Loading config from "/home/em162155/.padbrc"
  Loading config from environment
  Loading config from command line
  Setting 'rmgr' to 'mpirun'
  DEBUG (config): 0: Finished setting configuration options
  padb version 3.n (Revision 279)
  full job report for job 29713

  No secret file (/home/em162155/.padb-secret)
  Error: Could not load secret file on this node


> Ashley,
> --
> Ashley Pittman, Bath, UK.
> Padb - A parallel job inspection tool for cluster computing