Thanks for the analysis. I have used your suggestions, but am still
frustrated by what I am seeing. I too have run my tests on single node
systems, and here is what I have done:
1. I modified the 'writeb' script to essentially mimic the example in
section 7.9.3 of Vol 2 of MPI: The Complete Reference. It writes a
100x100 matrix to a file with 4 processes. The script is attached.
2. When running this problem on my desktop systems, I continue to get IO
errors from 'MPI_FILE_WRITE_ALL' no matter how I distribute the
3. I ported the script to a cluster running PBS. The nodes on this
system have 4 processors, so the problem still runs on one node. The
environment is very similar to my desktop, except that it runs Open MPI
1.4.2 instead of 1.4.3. These are the results:
subsizes(1)=25, subsizes(2)=100 -- file written correctly
subsizes(1)=50, subsizes(2)=50 -- file written correctly
subsizes(1)=100, subsizes(2)=25 -- file written incorrectly
In this last case, it appears that only one of the processes does a
write. Ironically, this case uses the same process distribution
described in the example above.
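
As a sanity check that does not depend on MPI at all, the three layouts
above can be worked out by hand. The sketch below is plain Python, not
the attached Fortran, and it assumes a column-major (Fortran-order)
100x100 global array and a simple rank-to-block numbering that I made
up for illustration ('block_starts' is hypothetical, as are the other
function names). It lists which element offsets of the file each rank's
subarray should cover, so a written file can be compared against it.

```python
# Sketch only: models the element offsets selected by a 2-D subarray in
# a column-major (Fortran-order) global array. No MPI calls are made.

def subarray_offsets(sizes, subsizes, starts):
    """Return 0-based element offsets covered by one subarray, in file order.

    'starts' is 0-based, as MPI_TYPE_CREATE_SUBARRAY expects even when
    called from Fortran.
    """
    nrows = sizes[0]
    offsets = []
    for j in range(starts[1], starts[1] + subsizes[1]):      # dim 2: slow index
        for i in range(starts[0], starts[0] + subsizes[0]):  # dim 1: fast index
            offsets.append(i + j * nrows)
    return offsets


def block_starts(sizes, subsizes):
    """Hypothetical rank -> starts mapping (an assumption, not from 'writeb').

    Blocks are numbered down dimension 1 first, then across dimension 2.
    """
    nblocks1 = sizes[0] // subsizes[0]
    nblocks2 = sizes[1] // subsizes[1]
    return [(b1 * subsizes[0], b2 * subsizes[1])
            for b2 in range(nblocks2) for b1 in range(nblocks1)]


def contiguous_runs(offsets):
    """Count the contiguous pieces one rank's data occupies in the file."""
    runs = 1 if offsets else 0
    for prev, cur in zip(offsets, offsets[1:]):
        if cur != prev + 1:
            runs += 1
    return runs


def tiles_exactly(sizes, subsizes):
    """True if all ranks together cover every element exactly once."""
    covered = []
    for starts in block_starts(sizes, subsizes):
        covered.extend(subarray_offsets(sizes, subsizes, starts))
    return sorted(covered) == list(range(sizes[0] * sizes[1]))


if __name__ == "__main__":
    sizes = (100, 100)
    for subsizes in [(25, 100), (50, 50), (100, 25)]:
        for rank, starts in enumerate(block_starts(sizes, subsizes)):
            offs = subarray_offsets(sizes, subsizes, starts)
            print(subsizes, "rank", rank, "starts", starts,
                  "first offset", offs[0], "runs", contiguous_runs(offs))
        print(subsizes, "tiles the file exactly:", tiles_exactly(sizes, subsizes))
```

Under these assumptions the failing case (100 by 25) is the only one in
which each rank owns a single contiguous run of 2500 elements; the two
working cases split each rank's data into many shorter runs. That does
not explain the failure, but it does narrow down what is different
about the third layout.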
Am I still missing something in the setup of the problem, or is there
something about the Open MPI configuration on all of these systems that
is incorrect for MPI-IO? Although they haven't been doing any MPI-IO to
my knowledge, all of these systems have been successfully running large
MPI applications for 2-3 years.
On Fri, 2010-12-17 at 15:47 -0600, Rob Latham wrote:
> On Wed, Dec 15, 2010 at 01:21:35PM -0800, Tom Rosmond wrote:
> > I want to implement an MPI-IO solution for some of the IO in a large
> > atmospheric data assimilation system. Years ago I got some small
> > demonstration Fortran programs (I think from Bill Gropp) that seem to
> > be good candidate prototypes for what I need. Two of them are attached
> > as part of simple shell script wrappers (writea, writeb). Both programs
> > are doing equivalent things to write a small test file, but using
> > different MPI functions. Specifically, 'writea' does multiple writes
> > into the file using the 'MPI_FILE_SEEK', while 'writeb' does one write
> > call using 'MPI_TYPE_CREATE_SUBARRAY', and 'MPI_FILE_SET_VIEW'. My
> > problem is that while 'writea' works correctly, 'writeb' fails with an
> > IO_ERROR error code returned from the final 'MPI_FILE_WRITE' call. I
> > have looked at the code carefully and studied the MPI standard for the
> > functions used, and can't see what is wrong with the failing call, but I
> > must be missing something. I seem to remember the program running
> > correctly years ago, but that was on another platform and MPI
> > environment.
> My test environment isn't that different from yours, though I am
> running on a single node (laptop). Both MPICH2-1.3.1 and
> OpenMPI-1.4 pass the test.
> Some observations:
> - writeb leaks a datatype (you do not free the subarray type)
> - in writea you don't really need to seek and then write. You could
> call MPI_FILE_WRITE_AT_ALL.
> - You use collective I/O in writea (good for you!) but use independent
> I/O in writeb. Especially for a 2d subarray, you'll likely see
> better performance with MPI_FILE_WRITE_ALL.