Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] torque pbs behaviour...
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-08-11 07:36:17


On Aug 11, 2009, at 5:17 AM, Ashley Pittman wrote:

> On Tue, 2009-08-11 at 03:03 -0600, Ralph Castain wrote:
>> If it isn't already there, try putting a print statement right at
>> program start, another just prior to MPI_Init, and another just after
>> MPI_Init. It could be that something is hanging somewhere during
>> program startup since it sounds like everything is launching just
>> fine.
>
> If you suspect a hang then you can use the command orte-ps (on the node
> where the mpirun is running) and it should show you your job. This will
> tell you if the job is started and still running or if there was a
> problem launching.
>
> If the program did start and has really hung then you can get more
> in-depth information about it using padb which is linked to in my
> signature.

FWIW: we use padb for this purpose, and it is very helpful!
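
Going back to the print statements suggested earlier, here is a rough
sketch of what I mean. It is only an illustration: it assumes a Python
program using mpi4py (the original application may well be C or Fortran,
where the same pattern is printf plus fflush around MPI_Init), and the
helper name checkpoint is made up for this example. Each line is flushed
immediately so that it still reaches the output if the next call hangs.

```python
import os
import socket
import sys
import time


def checkpoint(label, stream=sys.stderr):
    """Write one progress line tagged with host and PID, then flush it."""
    stamp = time.strftime("%H:%M:%S")
    line = f"[{socket.gethostname()}:{os.getpid()}] {stamp} {label}\n"
    stream.write(line)
    stream.flush()  # flush so the line is not lost if the next call hangs
    return line


def main():
    checkpoint("program start")

    # Assumption: mpi4py is available. Its automatic MPI_Init/MPI_Finalize
    # are switched off so that the calls can be bracketed explicitly.
    try:
        import mpi4py
        mpi4py.rc.initialize = False
        mpi4py.rc.finalize = False
        from mpi4py import MPI
    except ImportError:
        checkpoint("mpi4py not available; skipping MPI_Init")
        return

    checkpoint("before MPI_Init")
    MPI.Init()
    checkpoint(f"after MPI_Init, rank {MPI.COMM_WORLD.Get_rank()}")

    MPI.Finalize()


if __name__ == "__main__":
    main()
```

If "program start" appears for every process but "after MPI_Init" never
does, the hang is inside MPI_Init rather than in the launch itself. If
nothing is printed at all, the problem is more likely before your program
even begins to run.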

Ralph

>
> Ashley,
>
> --
>
> Ashley Pittman, Bath, UK.
>
> Padb - A parallel job inspection tool for cluster computing
> http://padb.pittman.org.uk