
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Datasize confusion in MPI_Write can lead to data loss
From: George Bosilca (bosilca_at_[hidden])
Date: 2008-02-08 15:37:47

The patch I sent a few minutes ago only fixes the problem for Open
MPI. However, ROMIO's generic test for contiguous data types is still
broken: checking only for MPI_COMBINER_NAMED is clearly not enough. A
second test, verifying that the size and the extent of the data type
are equal, would make the check a lot more accurate.
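
To illustrate the size-versus-extent test, here is a minimal sketch in
plain C that does not call MPI at all. The type_info struct and the
functions describe_short_int and is_contig_named are hypothetical names
made up for this example; they are not ROMIO or Open MPI code. The
size field stands in for what MPI_Type_size reports and the extent
field for what MPI_Type_get_extent reports, under the assumption that
the extent of MPI_SHORT_INT matches the C struct layout.

```c
#include <assert.h>
#include <stddef.h>

/* Layout the C compiler gives the pair that MPI_SHORT_INT describes. */
struct short_int {
    short a;
    int   b;
};

/* Hypothetical stand-in for a datatype description (not a real MPI type). */
struct type_info {
    int    is_named; /* 1 if the combiner would be MPI_COMBINER_NAMED */
    size_t size;     /* bytes of useful data, like MPI_Type_size */
    size_t extent;   /* span in memory including gaps, like the extent */
};

/* Assumption: the extent equals sizeof of the matching C struct. */
static struct type_info describe_short_int(void)
{
    struct type_info t;
    t.is_named = 1;
    t.size     = sizeof(short) + sizeof(int);
    t.extent   = sizeof(struct short_int);
    return t;
}

/* A named type counts as contiguous only if it has no gaps,
 * i.e. its size equals its extent. Non-named types are rejected
 * here; a real check would handle them in other branches. */
static int is_contig_named(const struct type_info *t)
{
    assert(t->size <= t->extent);
    return t->is_named && t->size == t->extent;
}
```

On a platform where int is aligned to 4 bytes, describe_short_int
yields a size of 6 and an extent of 8, so is_contig_named returns 0
for it, while a gap-free named type such as one with size and extent
both equal to sizeof(int) is still reported as contiguous.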


On Feb 8, 2008, at 12:26 PM, Rainer Keller wrote:

> Hi George,
> Good that you come to the same conclusion with regard to ROMIO using
> MPI_Type_size internally...
> So either take the comment in iscontig.c ,-]
> /* This function needs more work. It should check for contiguity
> in other cases as well.*/
> and mail it to the ROMIO list, or have a specialized version of
> ADIOI_Datatype_iscontig for ompi ,-]
> Either way, the mpi_test_suite is sane in that regard.
> Thanks,
> Rainer
> On Friday 08 February 2008 18:22, George Bosilca wrote:
>> MPI_Type_size is supposed to return only the size of useful data,
>> which apparently it does (MPI_SHORT_INT is 6 bytes). What I think
>> happens is that MPI_SHORT_INT is a predefined type, but a really
>> strange one: it's one of the few that are not contiguous. The
>> problem seems to come from the fact that MPI_File_write does a
>> contiguous write for the predefined data types, assuming that they
>> are all contiguous.
>> I tracked the problem down to the romio/adio/common/is_contig.c file.
>> For Open MPI the last #else branch is used. The first case in the
>> switch checks for MPI_COMBINER_NAMED (which is what an MPI
>> implementation is supposed to return for predefined data types) and
>> sets the flag to 1 (which means contiguous). This is obviously wrong
>> for MPI_SHORT_INT.
>> It really looks like a ROMIO problem, so I guess this email should be
>> redirected to their mailing list.
>> Thanks,
>> george.
>> On Feb 8, 2008, at 12:50 PM, Christoph Niethammer wrote:
>>> Hello!
>>> I tested Open MPI at HLRS for some time without detecting new
>>> problems in the implementation, but now I have noticed some awful
>>> ones with MPI_File_write which can lead to data loss:
>>> When creating a struct for a mixed datatype like
>>> struct {
>>>     short a;
>>>     int b;
>>> };
>>> the C compiler introduces a gap of 2 bytes in the data
>>> representation of this type, due to the 4-byte alignment of the
>>> integer on 32-bit systems.
>>> If I now use MPI_File_write with MPI_SHORT_INT as the MPI datatype
>>> to write this data to a file, data is lost.
>>> I located the problem in the combined use of "write" and
>>> MPI_Type_size in MPI_File_write.
>>> MPI_Type_size(MPI_SHORT_INT) returns 6 bytes, whereas the struct
>>> uses 8 bytes in memory because of the 2-byte gap. The write function
>>> in ad_write.c then loses data because the gaps are not included in
>>> the calculation of the total data size to be written to the file.
>>> This problem also occurs in the other I/O functions.
>>> As far as I could find out, the problem does not seem to be present
>>> with derived data types.
>>> The question now is how to fix this:
>>> i) Either the MPI standard is not clear on this point, and the data
>>> types MPI_SHORT_INT, MPI_DOUBLE_INT, ... should be forbidden for
>>> use with structs of these types,
>>> ii) Or the implementation of the MPI_Type_size function has to be
>>> modified to return the value of e.g. true_ub, which contains the
>>> correct value,
>>> iii) Or the MPI_File_write function must not use the write function
>>> in the "contiguous" way on the data and should take care of the gaps.
>>> Regards
>>> Christoph Niethammer
> --
> ----------------------------------------------------------------
> Dipl.-Inf. Rainer Keller
> HLRS Tel: ++49 (0)711-685 6 5858
> Nobelstrasse 19 Fax: ++49 (0)711-685 6 5832
> 70550 Stuttgart email: keller_at_[hidden]
> Germany AIM/Skype:rusraink
