Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Trapping fortran I/O errors leaving zombie mpi processes
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-01-29 08:32:20


On Jan 29, 2010, at 8:23 AM, Laurence Marks wrote:

> I'll try, but sometimes these things are hard to reproduce and I have
> to wait for free nodes to do the test.

Understood.

> If I do manage to reproduce the
> issue (I've added ERR= traps, so would have to regress) any thing else
> to look at?

You might want to write up a trivial Fortran example outside of your main app -- a 10-20 line program that explicitly reads past the end of a trivial file in one MPI process while all the other processes are waiting in an MPI_Barrier, or somesuch. That way you could test this easily even on 1 node, and not have to regress your source, etc.

I think counting the processes should be sufficient. But with a small/trivial test like the one described above, you might even want to put in some extra print* statements, just to verify exactly where the process stopped, whether it actually exited, etc.
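
A minimal sketch of such a test might look like the following. It is an illustration, not code from the original application: the program name eof_test, the file name eof_test.dat, and unit number 10 are all made up for this example. Rank 0 writes a one-record file and then tries to read two records with no IOSTAT=, ERR=, or END= trap, so the second read should run past the end of the file while the other ranks sit in MPI_Barrier.

```fortran
program eof_test
  use mpi
  implicit none
  integer :: ierr, rank, nprocs, i, val

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  print *, 'rank', rank, 'of', nprocs, 'started'

  if (rank == 0) then
     ! Hypothetical scratch file: write one record, then try to read two.
     open(unit=10, file='eof_test.dat', status='replace', action='readwrite')
     write(10, *) 42
     rewind(10)
     do i = 1, 2
        print *, 'rank 0: about to read record', i
        ! Deliberately untrapped: the second read hits end-of-file.
        read(10, *) val
        print *, 'rank 0: read value', val
     end do
     close(10)
  end if

  ! All other ranks wait here while rank 0 fails in the read above.
  print *, 'rank', rank, 'entering MPI_Barrier'
  call MPI_Barrier(MPI_COMM_WORLD, ierr)
  print *, 'rank', rank, 'passed MPI_Barrier'

  call MPI_Finalize(ierr)
end program eof_test
```

Assuming the usual Open MPI wrappers are on your path, something like "mpif90 eof_test.f90 -o eof_test" followed by "mpirun -np 4 ./eof_test" should run it on a single node. The print statements show how far each rank got, and counting the leftover eof_test processes with ps afterwards shows whether anything was left behind.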

-- 
Jeff Squyres
jsquyres_at_[hidden]