Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Trapping fortran I/O errors leaving zombie mpi processes
From: Laurence Marks (L-marks_at_[hidden])
Date: 2010-01-29 09:13:21


OK, but trivial codes don't always reproduce problems.

Is strace useful?

On Fri, Jan 29, 2010 at 7:32 AM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> On Jan 29, 2010, at 8:23 AM, Laurence Marks wrote:
>
>> I'll try, but sometimes these things are hard to reproduce and I have
>> to wait for free nodes to do the test.
>
> Understood.
>
>> If I do manage to reproduce the
>> issue (I've added ERR= traps, so would have to regress) anything else
>> to look at?
>
> You might want to write up a trivial Fortran example outside of your main app -- a 10-20 line app that explicitly reads past the end of a trivial file in one MPI process while all the other processes are waiting in an MPI_Barrier, or some such.  That way you could test this easily even on 1 node, and not have to regress your source, etc.
>
> I think counting the processes should be sufficient.  But with a small/trivial test like described above, you might even want to put in some extra print* statements, just to verify exactly where the process stopped, whether it actually exited, etc.
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
>
>
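
For reference, the trivial test suggested in the quoted message could be sketched roughly as below. This is only an illustration, not code from the actual application: the program name, file name (eof_test.dat), and unit number are made up, and it assumes an MPI installation that provides the Fortran "mpi" module (otherwise include 'mpif.h' may be substituted for the use statement). Rank 0 reads past the end of a one-record file with no ERR=, END=, or IOSTAT= specifier, while every other rank goes straight to MPI_Barrier.

```fortran
! Hypothetical test program; all names here are illustrative.
program eof_trap_test
  use mpi                      ! or: include 'mpif.h' after implicit none
  implicit none
  integer, parameter :: u = 10 ! arbitrary unit number
  integer :: ierr, rank, nprocs, val

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  print *, 'rank', rank, 'of', nprocs, 'started'

  if (rank == 0) then
     ! Create a file holding exactly one record.
     open(unit=u, file='eof_test.dat', status='replace', action='readwrite')
     write(u, *) 42
     rewind(u)

     ! First read succeeds.
     read(u, *) val
     print *, 'rank 0: first read ok, val =', val

     ! Second read hits end-of-file. With no ERR=/END=/IOSTAT=
     ! specifier, the Fortran runtime is expected to stop this process.
     read(u, *) val
     print *, 'rank 0: not expected to reach this line'
  end if

  ! All other ranks wait here while rank 0 fails above.
  print *, 'rank', rank, 'entering barrier'
  call MPI_Barrier(MPI_COMM_WORLD, ierr)
  print *, 'rank', rank, 'passed barrier'

  call MPI_Finalize(ierr)
end program eof_trap_test
```

Assuming the usual Open MPI wrappers, something like "mpif90 eof_trap_test.f90 -o eof_trap_test" followed by "mpirun -np 4 ./eof_trap_test" on a single node should be enough to run it. The print statements show how far each rank got, and counting the remaining eof_trap_test processes afterwards (for example with ps) would show whether any were left behind.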

-- 
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Chair, Commission on Electron Crystallography of IUCR
www.numis.northwestern.edu/
Electron crystallography is the branch of science that uses electron
scattering and imaging to study the structure of matter.