On 25 Oct 2010, at 17:26, Jack Bryan wrote:
> Thanks, the problem is still there.
>
> I used:
>
> Only process 0 returns. Other processes are still stuck in MPI_Finalize().
>
> Any help is appreciated.
You can use the command "padb -aQ" to show the message queues for your application. You'll need to download and install padb, then simply run your job, allow it to hang, and then run padb. It will show the message queues for each rank it can find a process for (the ones that haven't exited). If that doesn't help, run "padb -axt" to get the stack traces and send the output to this list.
The website is in my signature, and there is a new beta release out this week at http://padb.googlecode.com/files/padb-3.2-beta1.tar.gz
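
The steps above could be wrapped up roughly as follows. This is only a sketch: the function name and the output file names are made up for illustration, and it assumes padb is already installed and on your PATH and that the job is already hung when you call it. The only padb options used are the two mentioned above.

```shell
# Hypothetical helper: run this from another terminal once the job has hung.
# Function name and output file names are illustrative assumptions.
inspect_hung_job() {
    # Bail out early if padb is not installed or not on PATH.
    if ! command -v padb >/dev/null 2>&1; then
        echo "padb not found on PATH; install it first" >&2
        return 1
    fi

    # Message queues for every rank that still has a live process.
    padb -aQ > padb-queues.txt

    # Stack traces, in case the message queues are not enough.
    padb -axt > padb-stacks.txt
}
```

Call inspect_hung_job while the ranks are still sitting in MPI_Finalize(), then attach the two output files when you reply to the list.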
Ashley.
--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk