
Subject: Re: [OMPI users] Onesided + derived datatypes
From: George Bosilca (bosilca_at_[hidden])
Date: 2008-12-13 14:58:48


No. It fixes an issue with correctly rebuilding the datatype on the
remote side (i.e., with the real displacements), but it doesn't fix
the wrong-values problem.

   george.

On Dec 13, 2008, at 07:59, Jeff Squyres wrote:

> George -- you had a commit after this (r20123) -- did that fix the
> problem?
>
>
> On Dec 12, 2008, at 8:14 PM, George Bosilca wrote:
>
>> Dorian,
>>
>> I looked into this again. So far I can confirm that the datatype is
>> correctly created and always contains the correct values
>> (internally). If you use send/recv instead of one-sided, the output
>> is exactly what you expect. With one-sided there are several strange
>> things. What I can say so far is that everything works fine except
>> when the block-indexed datatype is used as the remote datatype in
>> the MPI_Put operation. In that case the remote memory is not
>> modified.
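>>
>> A minimal sketch of the failing pattern (names here are
>> illustrative, not taken from the attached test; the base type in the
>> test is a 3-double struct rather than MPI_DOUBLE):
>>
>>   double buf[4] = {0.0, 1.0, 2.0, 3.0};  // origin data, contiguous
>>   int displs[4] = {3, 2, 1, 0};
>>   MPI_Datatype blk;
>>   MPI_Type_create_indexed_block(4, 1, displs, MPI_DOUBLE, &blk);
>>   MPI_Type_commit(&blk);
>>
>>   // win is the window created earlier with MPI_Win_create.
>>   // Per the report, a send/recv pair using the same types works,
>>   // while this MPI_Put leaves the target window untouched.
>>   MPI_Win_fence(0, win);
>>   MPI_Put(buf, 4, MPI_DOUBLE, 1 /* target rank */, 0 /* disp */,
>>           1, blk, win);
>>   MPI_Win_fence(0, win);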
>>
>> george.
>>
>>
>> On Dec 12, 2008, at 08:20, Dorian Krause wrote:
>>
>>> Hi again.
>>>
>>> I adapted my test program by completely overwriting the window
>>> buffer with one-bits beforehand (so untouched doubles print as
>>> nan). This lets me see exactly where Open MPI writes.
>>> The result is:
>>>
>>> *** -DO1=1 -DV1=1 *** (displ 3,2,1,0, MPI_Type_create_indexed_block)
>>> mem[0] = { 0.0000, 0.0000, 0.0000}
>>> mem[1] = { 0.0000, 0.0000, 0.0000}
>>> mem[2] = { 0.0000, 0.0000, 0.0000}
>>> mem[3] = { nan, nan, nan}
>>> mem[4] = { nan, nan, nan}
>>> mem[5] = { nan, nan, nan}
>>> mem[6] = { nan, nan, nan}
>>> mem[7] = { nan, nan, nan}
>>> mem[8] = { nan, nan, nan}
>>> mem[9] = { nan, nan, nan}
>>> *** -DO1=1 -DV2=1 *** MPI_Type_contiguous(4, mpi_double3, &mpit)
>>> mem[0] = { 0.0000, 1.0000, 2.0000}
>>> mem[1] = { 3.0000, 4.0000, 5.0000}
>>> mem[2] = { 6.0000, 7.0000, 8.0000}
>>> mem[3] = { 9.0000, 10.0000, 11.0000}
>>> mem[4] = { nan, nan, nan}
>>> mem[5] = { nan, nan, nan}
>>> mem[6] = { nan, nan, nan}
>>> mem[7] = { nan, nan, nan}
>>> mem[8] = { nan, nan, nan}
>>> mem[9] = { nan, nan, nan}
>>> *** -DO2=1 -DV1=1 *** (displ 0,1,2,3, MPI_Type_create_indexed_block)
>>> mem[0] = { 0.0000, 0.0000, 0.0000}
>>> mem[1] = { 0.0000, 0.0000, 0.0000}
>>> mem[2] = { 0.0000, 0.0000, 0.0000}
>>> mem[3] = { 0.0000, 0.0000, 0.0000}
>>> mem[4] = { nan, nan, nan}
>>> mem[5] = { nan, nan, nan}
>>> mem[6] = { nan, nan, nan}
>>> mem[7] = { nan, nan, nan}
>>> mem[8] = { nan, nan, nan}
>>> mem[9] = { nan, nan, nan}
>>> *** -DO2=1 -DV2=1 *** MPI_Type_contiguous(4, mpi_double3, &mpit)
>>> mem[0] = { 0.0000, 1.0000, 2.0000}
>>> mem[1] = { 3.0000, 4.0000, 5.0000}
>>> mem[2] = { 6.0000, 7.0000, 8.0000}
>>> mem[3] = { 9.0000, 10.0000, 11.0000}
>>> mem[4] = { nan, nan, nan}
>>> mem[5] = { nan, nan, nan}
>>> mem[6] = { nan, nan, nan}
>>> mem[7] = { nan, nan, nan}
>>> mem[8] = { nan, nan, nan}
>>> mem[9] = { nan, nan, nan}
>>>
>>> Note that for the reversed ordering (3,2,1,0) only 3 of the 4
>>> entries are written. If I use displacements 3,2,1,8
>>> I get
>>>
>>> *** -DO1=1 -DV1=1 ***
>>> mem[0] = { 0.0000, 0.0000, 0.0000}
>>> mem[1] = { 0.0000, 0.0000, 0.0000}
>>> mem[2] = { 0.0000, 0.0000, 0.0000}
>>> mem[3] = { nan, nan, nan}
>>> mem[4] = { nan, nan, nan}
>>> mem[5] = { nan, nan, nan}
>>> mem[6] = { nan, nan, nan}
>>> mem[7] = { nan, nan, nan}
>>> mem[8] = { 0.0000, 0.0000, 0.0000}
>>> mem[9] = { nan, nan, nan}
>>>
>>> but 3,2,8,1 yields
>>>
>>> *** -DO1=1 -DV1=1 ***
>>> mem[0] = { 0.0000, 0.0000, 0.0000}
>>> mem[1] = { 0.0000, 0.0000, 0.0000}
>>> mem[2] = { 0.0000, 0.0000, 0.0000}
>>> mem[3] = { nan, nan, nan}
>>> mem[4] = { nan, nan, nan}
>>> mem[5] = { nan, nan, nan}
>>> mem[6] = { nan, nan, nan}
>>> mem[7] = { nan, nan, nan}
>>> mem[8] = { nan, nan, nan}
>>> mem[9] = { nan, nan, nan}
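>>>
>>> For reference, the fill trick above can be done along these lines
>>> (a sketch; the buffer name and size are assumptions, not from the
>>> attached test):
>>>
>>>   // needs <cstring>: set every byte of the window to all-ones
>>>   // before the epoch, so any double that is never written prints
>>>   // as nan afterwards
>>>   memset(mem, 0xFF, 10 * sizeof(double3));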
>>>
>>> Dorian
>>>
>>>
>>>> -----Original Message-----
>>>> From: "Dorian Krause" <doriankrause_at_[hidden]>
>>>> Sent: 12.12.08 13:49:25
>>>> To: Open MPI Users <users_at_[hidden]>
>>>> Subject: Re: [OMPI users] Onesided + derived datatypes
>>>
>>>
>>>> Thanks George (and Brian :)).
>>>>
>>>> The MPI_Put error is gone. Did you take a look at the problem that
>>>> the PUT doesn't work with the block-indexed type? I'm still
>>>> getting the following output (V1 corresponds to the datatype
>>>> created with MPI_Type_create_indexed_block while the V2 type is
>>>> created with MPI_Type_contiguous; the ordering doesn't matter
>>>> anymore after your fix), which confuses me because I remember that
>>>> (on one machine) MPI_Put with MPI_Type_create_indexed worked until
>>>> the invalid-datatype error showed up (after a couple of
>>>> timesteps).
>>>>
>>>> *** -DO1=1 -DV1=1 ***
>>>> mem[0] = { 0.0000, 0.0000, 0.0000}
>>>> mem[1] = { 0.0000, 0.0000, 0.0000}
>>>> mem[2] = { 0.0000, 0.0000, 0.0000}
>>>> mem[3] = { 0.0000, 0.0000, 0.0000}
>>>> mem[4] = { 0.0000, 0.0000, 0.0000}
>>>> mem[5] = { 0.0000, 0.0000, 0.0000}
>>>> mem[6] = { 0.0000, 0.0000, 0.0000}
>>>> mem[7] = { 0.0000, 0.0000, 0.0000}
>>>> mem[8] = { 0.0000, 0.0000, 0.0000}
>>>> mem[9] = { 0.0000, 0.0000, 0.0000}
>>>> *** -DO1=1 -DV2=1 ***
>>>> mem[0] = { 5.0000, 0.0000, 0.0000}
>>>> mem[1] = { 0.0000, 0.0000, -1.0000}
>>>> mem[2] = { 0.0000, 0.0000, 0.0000}
>>>> mem[3] = { 0.0000, 0.0000, 0.0000}
>>>> mem[4] = { 0.0000, 0.0000, 0.0000}
>>>> mem[5] = { 0.0000, 0.0000, 0.0000}
>>>> mem[6] = { 0.0000, 0.0000, 0.0000}
>>>> mem[7] = { 0.0000, 0.0000, 0.0000}
>>>> mem[8] = { 0.0000, 0.0000, 0.0000}
>>>> mem[9] = { 0.0000, 0.0000, 0.0000}
>>>> *** -DO2=1 -DV1=1 ***
>>>> mem[0] = { 0.0000, 0.0000, 0.0000}
>>>> mem[1] = { 0.0000, 0.0000, 0.0000}
>>>> mem[2] = { 0.0000, 0.0000, 0.0000}
>>>> mem[3] = { 0.0000, 0.0000, 0.0000}
>>>> mem[4] = { 0.0000, 0.0000, 0.0000}
>>>> mem[5] = { 0.0000, 0.0000, 0.0000}
>>>> mem[6] = { 0.0000, 0.0000, 0.0000}
>>>> mem[7] = { 0.0000, 0.0000, 0.0000}
>>>> mem[8] = { 0.0000, 0.0000, 0.0000}
>>>> mem[9] = { 0.0000, 0.0000, 0.0000}
>>>> *** -DO2=1 -DV2=1 ***
>>>> mem[0] = { 5.0000, 0.0000, 0.0000}
>>>> mem[1] = { 0.0000, 0.0000, -1.0000}
>>>> mem[2] = { 0.0000, 0.0000, 0.0000}
>>>> mem[3] = { 0.0000, 0.0000, 0.0000}
>>>> mem[4] = { 0.0000, 0.0000, 0.0000}
>>>> mem[5] = { 0.0000, 0.0000, 0.0000}
>>>> mem[6] = { 0.0000, 0.0000, 0.0000}
>>>> mem[7] = { 0.0000, 0.0000, 0.0000}
>>>> mem[8] = { 0.0000, 0.0000, 0.0000}
>>>> mem[9] = { 0.0000, 0.0000, 0.0000}
>>>>
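>>>> For clarity, the two variants should describe the same layout; a
>>>> minimal sketch of their construction (mpi_double3 stands for the
>>>> committed 3-double type from the test, and v1/v2 are illustrative
>>>> names):
>>>>
>>>>   MPI_Datatype v1, v2;
>>>>
>>>>   // V1: four blocks of one mpi_double3 each at displacements 0..3
>>>>   int displs[4] = {0, 1, 2, 3};
>>>>   MPI_Type_create_indexed_block(4, 1, displs, mpi_double3, &v1);
>>>>
>>>>   // V2: four mpi_double3 elements back to back -- with these
>>>>   // displacements both types have identical type maps
>>>>   MPI_Type_contiguous(4, mpi_double3, &v2);
>>>>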
>>>>
>>>> Thanks for your help.
>>>>
>>>> Dorian
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: "George Bosilca" <bosilca_at_[hidden]>
>>>>> Sent: 12.12.08 01:35:57
>>>>> To: Open MPI Users <users_at_[hidden]>
>>>>> Subject: Re: [OMPI users] Onesided + derived datatypes
>>>>
>>>>
>>>>> Dorian,
>>>>>
>>>>> You are right, the datatype generated by the indexed-block
>>>>> function is a legal datatype. We wrongly detected some
>>>>> overlapping regions in its description [overlapping regions are
>>>>> illegal per the MPI standard]. Since detecting such overlapping
>>>>> regions without false positives (such as your datatype) is a very
>>>>> expensive process, I preferred to remove the check completely.
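>>>>>
>>>>> To illustrate the distinction (an example constructed here, not
>>>>> taken from the thread's code):
>>>>>
>>>>>   // genuinely overlapping: displacement 0 appears twice, so two
>>>>>   // blocks alias the same memory -- illegal as a target type
>>>>>   int bad[4]  = {0, 0, 1, 2};
>>>>>
>>>>>   // Dorian's type: disjoint blocks, merely permuted -- legal
>>>>>   int good[4] = {3, 2, 1, 0};
>>>>>   MPI_Type_create_indexed_block(4, 1, good, mpi_double3, &t);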
>>>>>
>>>>> To keep it short: I just committed a patch (r20120) to the trunk,
>>>>> and I'll take care of moving it to the 1.3 and 1.2.9 branches.
>>>>>
>>>>> Thanks for your help,
>>>>> george.
>>>>>
>>>>> On Dec 10, 2008, at 18:07, doriankrause wrote:
>>>>>
>>>>>> Hi List,
>>>>>>
>>>>>> I have an MPI program which uses one-sided communication with
>>>>>> derived datatypes (MPI_Type_create_indexed_block). I developed
>>>>>> the code with MPICH2 and unfortunately didn't think about trying
>>>>>> it out with Open MPI. Now that I'm "porting" the application to
>>>>>> Open MPI I'm facing some problems. On most machines I get a
>>>>>> SIGSEGV in MPI_Win_fence; sometimes an invalid-datatype error
>>>>>> shows up. I ran the program under Valgrind and didn't get
>>>>>> anything valuable. Since I can't see a reason for this problem
>>>>>> (at least if I understand the standard correctly), I wrote the
>>>>>> attached test program.
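>>>>>>
>>>>>> The overall shape of the test is roughly this (a sketch, not the
>>>>>> attached ompitest.cc verbatim; mem, buf, rank, and the type
>>>>>> names are assumptions):
>>>>>>
>>>>>>   MPI_Win win;
>>>>>>   MPI_Win_create(mem, 10 * sizeof(double3), sizeof(double3),
>>>>>>                  MPI_INFO_NULL, MPI_COMM_WORLD, &win);
>>>>>>   MPI_Win_fence(0, win);
>>>>>>   if (rank == 0)
>>>>>>     MPI_Put(buf, 1, origin_type, 1, 0, 1, target_type, win);
>>>>>>   MPI_Win_fence(0, win);   // the reported SIGSEGV occurs here
>>>>>>   MPI_Win_free(&win);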
>>>>>>
>>>>>> Here are my experiences:
>>>>>>
>>>>>> * If I compile without ONESIDED defined, everything works and V1
>>>>>>   and V2 give the same results.
>>>>>> * If I compile with ONESIDED and V2 defined
>>>>>>   (MPI_Type_contiguous), it works.
>>>>>> * ONESIDED + V1 + O2: no errors, but obviously nothing is sent?
>>>>>>   (Am I right in assuming that V1+O2 and V2 should be
>>>>>>   equivalent?)
>>>>>> * ONESIDED + V1 + O1:
>>>>>>   [m02:03115] *** An error occurred in MPI_Put
>>>>>>   [m02:03115] *** on win
>>>>>>   [m02:03115] *** MPI_ERR_TYPE: invalid datatype
>>>>>>   [m02:03115] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>>>
>>>>>> I didn't get a segfault as in the "real life example", but if
>>>>>> ompitest.cc is correct, it means that Open MPI is buggy when it
>>>>>> comes to one-sided communication and (some) derived datatypes,
>>>>>> so the problem is probably not in my code.
>>>>>>
>>>>>> I'm using Open MPI 1.2.8 with the newest gcc, 4.3.2, but the
>>>>>> same behaviour can be seen with gcc 3.3.1 and Intel 10.1.
>>>>>>
>>>>>> Please correct me if ompitest.cc contains errors. Otherwise I
>>>>>> would be glad to hear how I should report these problems to the
>>>>>> developers (if they don't read this).
>>>>>>
>>>>>> Thanks + best regards
>>>>>>
>>>>>> Dorian
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> <ompitest.tar.gz>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
>