On Wed, May 22, 2013 at 12:23:36PM -0400, Eric Chamberland wrote:
> On 05/22/2013 11:33 AM, Tom Rosmond wrote:
> >Thanks for the confirmation of the MPIIO problem. Interestingly, we
> >have the same problem when using MPIIO in INTEL MPI. So something
> >fundamental seems to be wrong.
> I think but I am not sure that it is because the MPI I/O (ROMIO)
> code is the same for all distributions...
> It has been written by Rob Latham.
Hello! Rajeev wrote it when he was in grad school, then he passed the
torch to Rob Ross when he was a post-doc at Argonne, and now I've been
the caretaker for the last mumble-mumble years. (Now if I could only
find some other sucker...)
Tom, Eric: I have recently fixed this bug for some cases. I don't
know when Open MPI will re-sync with ROMIO (it's getting harder and
harder to keep ROMIO as "the standalone MPI-IO implementation"), but it
should be straightforward to pick up that change
(it's this one:
The functional descriptions for ROMIO are indeed "integer count of
some datatype", but one can still use that to say "write a billion
ROMIO handles this internally with as many calls to the write(2)
system call as it takes to complete.
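A minimal sketch of the two ideas above, in Python rather than ROMIO's C.
The names split_count and write_all and the 1 MiB chunk size are made up
for illustration; this is not ROMIO's actual code, only the general
pattern: describe a large transfer as an int-sized count of a bigger
type, then keep calling write(2) until every byte has gone out.

```python
import os

INT_MAX = 2**31 - 1  # largest value a C "int count" argument can carry


def split_count(total_bytes, chunk_bytes=1 << 20):
    """Express total_bytes as (count of chunk-sized types, leftover bytes).

    chunk_bytes stands in for the size of a user-defined datatype; the
    1 MiB default is an arbitrary choice for this sketch.
    """
    count, leftover = divmod(total_bytes, chunk_bytes)
    if count > INT_MAX:
        raise ValueError("chunk type too small for this transfer")
    return count, leftover


def write_all(fd, data):
    """Call os.write (a thin wrapper over write(2)) until all of data is written.

    A single write may accept fewer bytes than requested, so loop and
    resume from wherever the previous call stopped.
    """
    view = memoryview(data)
    written = 0
    while written < len(view):
        written += os.write(fd, view[written:])
    return written
```

With these assumptions, a 5 GiB request becomes split_count(5 * 2**30),
that is 5120 chunks of 1 MiB with nothing left over, so the count fits
comfortably in an int even though the byte total does not.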
If you try to get fancy and make a struct of three thousand
megabyte-sized MPI_CONTIG types, MPICH will blow up. I haven't tested
But basic types should be ok, now.
Mathematics and Computer Science Division
Argonne National Lab, IL USA