Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Problem with MPI_Type_indexed and hole (defined with MPI_Type_create_resized )
From: Pascal Deveze (Pascal.Deveze_at_[hidden])
Date: 2010-03-19 06:14:42


Hi George,

I investigated further and found a solution.

ADIOI_Datatype_iscontig is defined in the file
ompi/mca/io/romio/src/io_romio_module.c as:

void ADIOI_Datatype_iscontig(MPI_Datatype datatype, int *flag)
{
    /*
     * Open MPI contiguous check return true for datatype with
     * gaps in the beginning and at the end. We have to provide
     * a count of 2 in order to get these gaps taken into account.
     */
    *flag = ompi_datatype_is_contiguous_memory_layout(datatype, 2);
}

The comment clearly states that a count of 2 should take the gaps into
account, but that is not always the case.

Your proposal is to modify the ROMIO code.
So, I propose to fix ADIOI_Datatype_iscontig by adding the following code
after the call to ompi_datatype_is_contiguous_memory_layout():

    if (*flag) {
        MPI_Aint true_extent, true_lb;

        ompi_datatype_get_true_extent(datatype, &true_lb, &true_extent);

        if (true_lb > 0)
            *flag = 0;
    }
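
To make the effect of this extra test concrete, here is a minimal sketch in plain C with no MPI dependency. The toy_type structure and both toy_* functions are hypothetical stand-ins invented for illustration; they only approximate the behaviour discussed in this thread (one block of data per element, no gap between elements when count > 1) and are not the Open MPI datatype engine.

```c
#include <assert.h>
#include <stddef.h>

/* Toy description of a datatype. NOT an Open MPI structure. */
typedef struct {
    ptrdiff_t size;     /* bytes of real data in one element          */
    ptrdiff_t true_lb;  /* offset of the first data byte              */
    ptrdiff_t true_ub;  /* offset just past the last data byte        */
    ptrdiff_t lb;       /* lower bound (possibly set by a resize)     */
    ptrdiff_t extent;   /* stride between two consecutive elements    */
} toy_type;

/* Assumed model of the engine's check: the data of `count` elements
 * forms a single block, wherever that block starts. */
static int toy_engine_is_contig(const toy_type *t, int count)
{
    assert(count >= 1);
    if (t->size != t->true_ub - t->true_lb)
        return 0;                   /* hole inside one element        */
    if (count > 1 && t->size != t->extent)
        return 0;                   /* hole between two elements      */
    return 1;
}

/* The check with count = 2, followed by the extra true_lb test
 * proposed in this message. */
static int toy_romio_is_contig(const toy_type *t)
{
    int flag = toy_engine_is_contig(t, 2);

    if (flag && t->true_lb > 0)
        flag = 0;                   /* data does not start at offset 0 */
    return flag;
}
```

Under this model, the inner indexed type from the test program (one char at displacement 1, so size 1, true_lb 1, extent 1) passes the count-of-2 check because it has no gap between elements, yet it is rejected once true_lb is examined. A plain char (true_lb 0) is still reported as contiguous.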

Regards,

Pascal

On Mar 18, 2010, at 13:24, George Bosilca wrote:
> We will disagree on that, but your datatype is contiguous. It doesn't
> matter that there are gaps in the beginning and at the end, as long as
> you only send one such datatype the real data that has to go over the
> network _is_ contiguous. And this is what the Open MPI datatype engine
> is reporting back.
>
> Apparently, ROMIO expects a contiguous datatype to start from the
> position 0 relative to the beginning of the user buffer. I don't see
> why they have such a restrictive view, but I guess the original MPICH
> datatype engine was not able to distinguish between gaps in the middle
> and gaps at the beginning and the end of the datatype.
>
> I don't see how to fix that in ROMIO code. But in case you plan to fix
> it, the correct solution is to retrieve the true lower bound of the
> datatype in the contiguous case and add it to st_offset.
>
> george.
>
> On Mar 18, 2010, at 12:27 , Pascal Deveze wrote:
>
>> Hi all,
>>
>> Sorry, I forgot to mention my port of the file romio/adio/comm/flatten.c
>> from MPICH2 to Open MPI (flatten.c in Open MPI does not support
>> MPI_COMBINER_RESIZED).
>>
>> Here is the diff:
>>
>> diff -u flatten.c flatten.c.old
>> --- flatten.c 2010-03-18 17:07:43.000000000 +0100
>> +++ flatten.c.old 2010-03-18 17:14:04.000000000 +0100
>> @@ -525,44 +525,6 @@
>> }
>> break;
>> - case MPI_COMBINER_RESIZED:
>> - /* This is done similar to a type_struct with an lb, datatype, ub */
>> -
>> - /* handle the Lb */
>> - j = *curr_index;
>> - flat->indices[j] = st_offset + adds[0];
>> - flat->blocklens[j] = 0;
>> -
>> - (*curr_index)++;
>> -
>> - /* handle the datatype */
>> -
>> - MPI_Type_get_envelope(types[0], &old_nints, &old_nadds,
>> - &old_ntypes, &old_combiner);
>> - ADIOI_Datatype_iscontig(types[0], &old_is_contig);
>> -
>> - if ((old_combiner != MPI_COMBINER_NAMED) && (!old_is_contig)) {
>> - ADIOI_Flatten(types[0], flat, st_offset+adds[0], curr_index);
>> - }
>> - else {
>> - /* current type is basic or contiguous */
>> - j = *curr_index;
>> - flat->indices[j] = st_offset;
>> - MPI_Type_size(types[0], (int*)&old_size);
>> - flat->blocklens[j] = old_size;
>> -
>> - (*curr_index)++;
>> - }
>> -
>> - /* take care of the extent as a UB */
>> - j = *curr_index;
>> - flat->indices[j] = st_offset + adds[0] + adds[1];
>> - flat->blocklens[j] = 0;
>> -
>> - (*curr_index)++;
>> -
>> - break;
>> -
>> default:
>> /* TODO: FIXME (requires changing prototypes to return errors...) */
>> FPRINTF(stderr, "Error: Unsupported datatype passed to ADIOI_Flatten\n");
>> @@ -827,29 +789,6 @@
>> }
>> }
>> break;
>> -
>> - case MPI_COMBINER_RESIZED:
>> - /* treat it as a struct with lb, type, ub */
>> -
>> - /* add 2 for lb and ub */
>> - (*curr_index) += 2;
>> - count += 2;
>> -
>> - /* add for datatype */
>> - MPI_Type_get_envelope(types[0], &old_nints, &old_nadds,
>> - &old_ntypes, &old_combiner);
>> - ADIOI_Datatype_iscontig(types[0], &old_is_contig);
>> -
>> - if ((old_combiner != MPI_COMBINER_NAMED) && (!old_is_contig)) {
>> - count += ADIOI_Count_contiguous_blocks(types[0], curr_index);
>> - }
>> - else {
>> - /* basic or contiguous type */
>> - count++;
>> - (*curr_index)++;
>> - }
>> - break;
>> -
>> default:
>> /* TODO: FIXME */
>> FPRINTF(stderr, "Error: Unsupported datatype passed to ADIOI_Count_contiguous_blocks, combiner = %d\n", combiner);
>>
>>
>> Regards,
>>
>> Pascal
>>
>> Pascal Deveze wrote:
>> > Hi all,
>> >
>> > I use a very simple datatype defined as follows:
>> > lng[0]= 1;
>> > dsp[0]= 1;
>> > err=MPI_Type_indexed(1, lng, dsp, MPI_CHAR, &offtype);
>> > err=MPI_Type_create_resized(offtype, 0, 2, &filetype);
>> > MPI_Type_commit(&filetype);
>> >
>> > This datatype consists of a hole (of length 1 char) followed by a char.
>> >
>> > A datatype with a hole at the beginning is not handled correctly by the
>> > ROMIO integrated in Open MPI (I tried with MPICH2 and it worked fine).
>> > You will see below a program to reproduce the problem.
>> >
>> > After investigating, I see that the difference between Open MPI and
>> > MPICH appears at line 542 in the file romio/adio/comm/flatten.c:
>> >
>> > case MPI_COMBINER_RESIZED:
>> > /* This is done similar to a type_struct with an lb, datatype, ub */
>> >
>> > /* handle the Lb */
>> > j = *curr_index;
>> > flat->indices[j] = st_offset + adds[0];
>> > flat->blocklens[j] = 0;
>> >
>> > (*curr_index)++;
>> >
>> > /* handle the datatype */
>> >
>> > MPI_Type_get_envelope(types[0], &old_nints, &old_nadds,
>> > &old_ntypes, &old_combiner);
>> > ADIOI_Datatype_iscontig(types[0], &old_is_contig); <========== line 542
>> >
>> > For MPICH2, the datatype is not contiguous, but for Open MPI it is.
>> > The routine ADIOI_Datatype_iscontig is quite different in Open MPI
>> > because the datatypes are handled very differently. If I reset
>> > old_is_contig just after line 542, the problem disappears (of course,
>> > this is not a solution).
>> >
>> > I am not able to propose a proper solution. Can somebody help?
>> >
>> > Pascal
>> >
>> > ============ Program to reproduce the problem ========
>> > #include <stdio.h>
>> > #include "mpi.h"
>> >
>> > char filename[256]="VIEW_TEST";
>> > char buffer[100];
>> > int err, i, myid, dsp[3], lng[3];
>> > MPI_Status status;
>> > MPI_File fh;
>> > MPI_Datatype filetype, offtype;
>> > MPI_Aint lb, extent;
>> >
>> > int main(int argc, char **argv) {
>> >
>> > MPI_Init(&argc, &argv);
>> > MPI_Comm_rank(MPI_COMM_WORLD, &myid);
>> > for (i=0; i<sizeof(buffer); i++) buffer[i] = i;
>> >
>> > if (!myid) {
>> > MPI_File_open(MPI_COMM_SELF, filename, MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);
>> > MPI_File_write(fh, buffer, sizeof(buffer), MPI_CHAR, &status);
>> > MPI_File_close(&fh);
>> >
>> > lng[0]= 1;
>> > dsp[0]= 1;
>> > MPI_Type_indexed(1, lng, dsp, MPI_CHAR, &offtype);
>> > MPI_Type_create_resized(offtype, 0, 2, &filetype);
>> > MPI_Type_commit(&filetype);
>> >
>> > MPI_File_open(MPI_COMM_SELF, filename, MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
>> > MPI_File_set_view(fh, 0, MPI_CHAR, filetype,"native", MPI_INFO_NULL);
>> > MPI_File_read(fh, buffer, 5, MPI_CHAR, &status);
>> >
>> > printf("Data: ");
>> > for (i=0 ; i<5 ; i++) printf(" %x ", buffer[i]);
>> > if (buffer[1] != 3) printf("\n =======> test KO : buffer[1]=%d instead of %d \n", buffer[1], 4);
>> > else printf("\n =======> test OK\n");
>> > MPI_Type_free(&filetype);
>> > MPI_File_close(&fh);
>> > }
>> > MPI_Barrier(MPI_COMM_WORLD);
>> > MPI_Finalize();
>> > }
>> > ============ The result of the program with MPICH2 ========
>> > Data: 1 3 5 7 9
>> > =======> test OK
>> >
>> > ============ The result of the program with OpenMPI ========
>> > Data: 0 2 4 6 8
>> > =======> test KO : buffer[1]=2 instead of 4
>> >
>> > Comment: Only the first hole is omitted.
>> >
>> >
>> >
>> >
>>
>
>
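
The two result listings in the original report are consistent with a one-byte shift of the view: the test file holds the value i at byte i, so reading 1 3 5 7 9 means offsets 1, 3, 5, 7, 9 were accessed, while 0 2 4 6 8 means the leading hole was dropped. The following sketch reproduces that arithmetic in plain C. The function toy_view_offsets and its parameters are hypothetical names chosen for illustration; this is not ROMIO code, and it assumes a filetype with a single contiguous data block per tile.

```c
#include <assert.h>
#include <stddef.h>

/* Compute the file offsets of the first n data bytes seen through a
 * view whose filetype holds one block of `size` bytes starting at
 * `true_lb`, tiled every `extent` bytes from `st_offset`.
 * Illustrative helper only, not part of ROMIO or Open MPI. */
static void toy_view_offsets(ptrdiff_t st_offset, ptrdiff_t true_lb,
                             ptrdiff_t size, ptrdiff_t extent,
                             int n, ptrdiff_t *out)
{
    assert(size > 0 && n >= 0);
    for (int k = 0; k < n; k++) {
        ptrdiff_t tile   = k / size;   /* which repetition of the type */
        ptrdiff_t within = k % size;   /* byte inside the data block   */

        /* Adding true_lb is the step that accounts for the leading hole. */
        out[k] = st_offset + tile * extent + true_lb + within;
    }
}
```

Calling it with st_offset 0, true_lb 1, size 1, extent 2 and n 5 gives the offsets 1, 3, 5, 7, 9. Passing true_lb 0 instead, which models a flattening step that ignores the true lower bound, gives 0, 2, 4, 6, 8.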