Open MPI Development Mailing List Archives

Subject: [OMPI devel] mpirun Produces Extraneous Output
From: Greg Thomsen (gthomsen_at_[hidden])
Date: 2014-06-10 14:36:36


All,

I believe I've found a bug in the I/O forwarding portion of Open MPI
that occasionally causes mpirun to emit additional data on standard
output that was not produced by the application being run.

The application in question reads from standard input and writes to
standard output only on the rank 0 process. The non-rank-0 processes
participate only in computation and produce no data on standard
output. The application is used in standard Unix-like pipelines like so:

    A | mpirun -np 4 application | B

Since B expects structured input, it is sensitive to any additional
data appearing in the stream.
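
For concreteness, a quick way to check for the extra data (with head -c
and wc -c standing in here for A and B, and assuming, purely for
illustration, that application copies its input through unchanged on
rank 0):

    # 1,000,000 bytes in should yield exactly 1,000,000 bytes out; any
    # surplus is data injected by the forwarding layer rather than by
    # the application.
    head -c 1000000 /dev/zero | mpirun -np 4 application | wc -c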

While chasing down the source of this problem, I've observed the following:

* The problem is sensitive to timing. Using strace to figure out where
   the problem lies can easily hide it. Either of the following changed
   how the issue manifested:

     A | mpirun -np 4 strace -o output.txt -e read,write application | B
     A | strace -ff -o output.txt -e read,write mpirun -np 4 application | B

   While harder to state definitively, redirecting input from a file
   and output to a file, rather than using pipes, also appears to hide
   the problem. Since the workflow in question involves large volumes
   of data, file-based I/O isn't feasible and wasn't explored
   thoroughly during testing.

* It appears to be correlated with a short I/O operation. A short read
   from the application's standard output corresponds to the position
   of the first byte of the extraneous output sent to B. Looking at hex
   dumps indicates that the contents of a recent buffer are
   inadvertently written to B.

   The attached test case demonstrates this; a rough sketch of the
   comparison follows this list.

* This is also an issue when forwarding standard error from the rank 0
   process. Modifying the application so that it writes only to
   standard error, and then redirecting standard error to standard
   output in the shell, still triggers the problem:

     A | mpirun -np 4 application 2>&1 | B

* This seems to occur only at the end of the data stream. The pipeline
   in question works through records, so corruption anywhere but the
   last record would have been noticed. In every case where the problem
   appeared regularly, it was at the end of the data stream.
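
As a rough sketch of the hex dump comparison mentioned above (the file
names, the patterned input, and the assumption that application passes
its input through unchanged are mine, not taken from the attached test
case):

    # Patterned input so a duplicated chunk is recognizable by eye.
    seq -w 1 200000 | head -c 1000000 > input.bin

    # Keep pipes on both sides of mpirun, since redirecting to and from
    # files appears to hide the problem; the trailing cat only captures
    # the stream.
    cat input.bin | mpirun -np 4 application | cat > actual.bin

    # Report the first differing byte (or where actual.bin runs past
    # the end of input.bin), then inspect the tail of the capture.
    cmp input.bin actual.bin
    xxd actual.bin | tail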

The attached shell script reproduces the problem in every version of
Open MPI tested (1.5.0, 1.6.3, 1.7.4, and 1.8.1). Without any
arguments, it reads a fixed amount of data from /dev/zero and compares
the size of the output of the above pipeline against the expected
size. For versions exhibiting the bug, under the conditions above, the
problem should appear within the first ~20 attempts. Under other
conditions I've seen the script run for a week without a failure.

When given an input path, it reads the first 1,000,000 bytes from that
path. With a fixed pattern in the data (see the compressed test input),
it is easy to see that the extra data is a copy of an earlier portion
of the stream.
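
In outline, the check the script performs looks roughly like this (a
minimal sketch, not the attached script verbatim; the names and the
pass-through assumption about application are illustrative):

    #!/bin/sh
    # Repeat the pipeline until the output size no longer matches the
    # input size, i.e. until extraneous bytes have been injected.
    expected=1000000
    attempt=0
    while :; do
        attempt=$((attempt + 1))
        actual=$(head -c "$expected" /dev/zero \
                 | mpirun -np 4 application | wc -c)
        if [ "$actual" -ne "$expected" ]; then
            echo "attempt $attempt: got $actual bytes, expected $expected"
            break
        fi
    done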

Hopefully this points someone to the right section of the code. Let me
know if additional information is needed.

Thanks!

Greg