On Wed, 9 Jul 2008, Ralph Castain wrote:
> stdin is read twice if rank=0 shares the node with mpirun

I consider this a very serious regression. Many Fortran scientific
programs (at least many that I know of) read their input from stdin.
This is because they were written, or at least started, many years ago
in Fortran 77, which AFAIK defines no standard way of handling
command-line arguments. Reading from stdin is therefore a convenient
and portable way to get data into the program, since stdin is known to
be already open and attached to a well-known I/O unit.

I just spent 2 days trying to understand why one such program (CHARMM),
which worked fine with many MPI implementations on many platforms,
including the stable 1.2 series on this very cluster, suddenly stops at
a step related to processing its input. After reading your message,
everything makes sense...

> Alternatively, we could ship 1.3 as-is, and warn users (similar to 1.2) that
> they should avoid reading from stdin if there is any chance that rank=0
> could be co-located with mpirun. Note that most of our clusters do not allow
> such co-location - but it is permitted by default by OMPI.

I don't know how your clusters are set up, but most that I have seen,
including all those that I administer, do run mpirun/mpiexec and rank=0
on the same node. I really think that this will bite a lot of people.

IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850