Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_File_write_ordered does not truncate files
From: Robert Latham (robl_at_[hidden])
Date: 2009-02-18 15:53:13

On Wed, Feb 18, 2009 at 11:10:51AM -0800, Brian Austin wrote:
> >> Can you confirm - are you -really- using 1.1.2???
> >>
> >> You might consider updating to something more recent, like 1.3.0 or
> >> at least 1.2.8. It would be interesting to know if you see the same
> >> problem.
> > Also, if you could include a short program that reproduces the
> > problem, that would be most helpful.
> Hi,
> thanks for your replies.
> It's true, I was using 1.1.2.
> I just switched to 1.3 and I see the same behavior.

The ordered-mode routines haven't changed in years, but for a host of
other reasons it's good that you're now working with 1.3.

> Here's a sample program.

Thanks for sending this along. It makes the problem quite clear: what
you are seeing is exactly the behavior described by the MPI standard.

> //write long file aa
> MPI_File_open( MPI_COMM_WORLD, "foo.txt",
> MPI_INFO_NULL, &fh );
> MPI_File_write_ordered( fh, a2_buff, 2, MPI_BYTE, &status );
> MPI_File_close( &fh );
> //foo.txt now says "aa"

As you are seeing, this is all as it should be, but you haven't done
anything tricky yet, so of course it all works.

> //write short file b
> MPI_File_open( MPI_COMM_WORLD, "foo.txt",
> MPI_INFO_NULL, &fh );
> MPI_File_write_ordered( fh, b1_buff, 1, MPI_BYTE, &status );
> MPI_File_close( &fh );
> //foo.txt now says "ba"
> //but I expect it to say "b"

Now we get to a trickier thing.

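When you open a file, the "default file view" and the initial location
of the individual and shared file pointers (offset 0) mean MPI did
exactly what you asked of it: write "b" to the 0th byte in the file.
Nothing in that sequence truncates the file, so the second "a" from the
first write is still there. The standard describes the default view
like this: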

        Initially, all processes view the file as a linear byte
        stream, and each process views data in its own native
        representation (no data representation conversion is
        performed). (POSIX files are linear byte streams in the native
        representation.) The file view can be changed via the
        MPI_FILE_SET_VIEW routine.
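
The same thing happens with an ordinary POSIX file that is opened
without O_TRUNC, which may make the "ba" result easier to see. Below is
a minimal sketch in plain C, with no MPI involved. The helper names
write_at_start and read_all are made up for this illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `len` bytes from `buf` starting at byte 0 of `path`.
 * `extra_flags` is OR-ed into the open() flags: pass O_TRUNC to discard
 * the old contents first, or 0 to keep whatever is already in the file.
 * Returns 0 on success, -1 on error. (Hypothetical helper.) */
int write_at_start(const char *path, const char *buf, size_t len,
                   int extra_flags)
{
    int fd = open(path, O_WRONLY | O_CREAT | extra_flags, 0644);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);   /* file offset starts at 0 */
    if (close(fd) != 0 || n != (ssize_t)len)
        return -1;
    return 0;
}

/* Read up to cap-1 bytes of `path` into `out` and NUL-terminate it.
 * Returns the number of bytes read, or -1 on error. (Hypothetical helper.) */
long read_all(const char *path, char *out, size_t cap)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    size_t n = fread(out, 1, cap - 1, f);
    out[n] = '\0';
    fclose(f);
    return (long)n;
}
```

Starting from a file that does not exist, write_at_start(path, "aa", 2, 0)
followed by write_at_start(path, "b", 1, 0) leaves "ba" in the file.
Passing O_TRUNC on the second call leaves just "b".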

I think you might want MPI_MODE_APPEND, but be warned:

        Specifying MPI_MODE_APPEND only guarantees that all shared and
        individual file pointers are positioned at the initial end of
        file when MPI_FILE_OPEN returns. Subsequent positioning of
        file pointers is application dependent. In particular, the
        implementation does not ensure that all writes are appended.

If you did not close the file between iterations, you'd get what you
expected, but the moment you re-open the file, the shared file
pointer resets to 0.

Now if I may provide a word of caution: please think extra-hard if
you want to use shared file pointers. They are implemented for
correctness, but not for performance. You will likely see much better
performance if you use MPI_EXSCAN to compute every MPI process's offset
into the file (I presume each process is writing a variable number of
bytes, or you wouldn't consider ordered mode in the first place,
right?) and then do an MPI_FILE_WRITE_AT_ALL to carry out the
I/O collectively.

Follow up if that didn't make any sense to you. I can provide
examples if need be.


Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B