
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] File seeking with shared filepointer issues
From: Rob Latham (robl_at_[hidden])
Date: 2011-07-05 11:07:30

On Mon, Jun 27, 2011 at 03:20:36PM +0200, pascal.deveze_at_[hidden] wrote:
> Christian,
> Suppose you have N processes calling the first
> MPI_File_get_position_shared().
> Some of them are running faster and could execute the call to
> MPI_File_seek_shared() before all the others have got their file position.
> (Note that the "collective" primitive is not a synchronization. In that
> case, all parameters are broadcast to process 0 and checked by process 0.
> All the other processes are not blocked.)
> So the slow processes can get the file position that has just been
> modified by the faster ones.
> That is the reason why, in your program, it is necessary to synchronize all
> processes just before the call to MPI_File_seek_shared().

There's this tool "Jumpshot" that's fun to use but does have a bit of
a learning curve: it just presents so much data it can be hard to make
sense of it.

Still, I like using Jumpshot, and this seemed like a good chance to
demonstrate Pascal's point about timings:

I've attached a Jumpshot trace of an 8-processor run of Christian's
test case.
- I built ROMIO to record not only the MPI-IO calls but also the
  underlying POSIX I/O calls.
- Then I enabled display of just the shared file pointer operations
  and the POSIX calls. Sorry if anyone is color blind.

  color / call

  purple / MPI_File_get_position_shared
  pink / MPI_File_seek_shared
  orange / posix open
  green / posix close
  blue / posix write

The attached trace shows
- rank 0 exiting MPI_File_get_position_shared relatively quickly;
- rank 0 entering MPI_File_seek_shared before anyone else;
- rank 0 writing the new value of the shared file pointer (the blue
  bar);
- that write happening before any other process read the value of the
  shared file pointer (the green bar).
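
The ordering in the trace can be written down as a small toy model. This
is only a sketch in plain C with no MPI in it: the function names, the
read_time array and the seek_latency parameter are all made up for
illustration and are not ROMIO internals. It assumes at least one rank
and that the fastest rank is the one that rewrites the pointer, as rank 0
does in the trace.

```c
#include <stddef.h>

/* Toy model only (not ROMIO code): every name here is hypothetical.
 * read_time[i] is the moment rank i reads the shared file pointer inside
 * MPI_File_get_position_shared. The fastest rank then overwrites the
 * pointer, as rank 0 does in the trace. Requires nranks >= 1. */

/* Time at which the fastest rank overwrites the shared file pointer. */
static double seek_write_time(const double *read_time, size_t nranks,
                              double seek_latency, int use_barrier)
{
    double first = read_time[0];
    double last = read_time[0];
    for (size_t i = 1; i < nranks; i++) {
        if (read_time[i] < first) first = read_time[i];
        if (read_time[i] > last)  last = read_time[i];
    }
    /* With a barrier, nobody seeks until the slowest rank has read. */
    return (use_barrier ? last : first) + seek_latency;
}

/* Number of ranks whose read happens after the pointer was overwritten,
 * i.e. ranks that get the modified position instead of the original. */
size_t ranks_seeing_new_offset(const double *read_time, size_t nranks,
                               double seek_latency, int use_barrier)
{
    double written = seek_write_time(read_time, nranks, seek_latency,
                                     use_barrier);
    size_t late = 0;
    for (size_t i = 0; i < nranks; i++) {
        if (read_time[i] > written) late++;
    }
    return late;
}
```

With read times {1.0, 5.0, 6.0, 7.0} and a latency of 0.5, the model
reports three ranks reading the modified pointer without a barrier and
none with one. In the real program the barrier corresponds to the
synchronization Pascal describes, for example an MPI_Barrier on the
file's communicator between the MPI_File_get_position_shared and
MPI_File_seek_shared calls.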

Anyway, this is all known behavior. Collecting the traces seemed like
a fun way to spend the last hour on Friday before the long (USA)
weekend :>


Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA