Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Onesided + derived datatypes
From: George Bosilca (bosilca_at_[hidden])
Date: 2008-12-12 20:14:30


Dorian,

I looked into this again. So far I can confirm that the datatype is
correctly created and always contains the correct values (internally).
If you use send/recv instead of one-sided, the output is exactly what
you expect. With one-sided there are several strange things. What I can
say so far is that everything works fine, except when the block-indexed
datatype is used as the remote datatype in the MPI_Put operation; in
that case the remote memory is not modified.
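
For reference, a minimal sketch of the failing combination -- this is
illustrative only, not the actual test (the element type and sizes are
modeled on your program; run it with 2 processes):

  #include <mpi.h>

  int main(int argc, char **argv)
  {
      int rank;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      /* element type: one block of 3 doubles, as in the test */
      MPI_Datatype double3;
      MPI_Type_contiguous(3, MPI_DOUBLE, &double3);
      MPI_Type_commit(&double3);

      /* 4 blocks of 1 element at displacements 0,1,2,3
         (in units of double3) */
      int displs[4] = {0, 1, 2, 3};
      MPI_Datatype blockidx;
      MPI_Type_create_indexed_block(4, 1, displs, double3, &blockidx);
      MPI_Type_commit(&blockidx);

      double mem[10 * 3] = {0}, src[4 * 3] = {0};
      MPI_Win win;
      MPI_Win_create(mem, sizeof(mem), sizeof(double),
                     MPI_INFO_NULL, MPI_COMM_WORLD, &win);

      MPI_Win_fence(0, win);
      if (rank == 0) {
          /* contiguous origin datatype, block-indexed *remote*
             datatype: the combination for which the target memory
             stays unmodified */
          MPI_Put(src, 4, double3, 1, 0, 1, blockidx, win);
      }
      MPI_Win_fence(0, win);

      MPI_Win_free(&win);
      MPI_Type_free(&blockidx);
      MPI_Type_free(&double3);
      MPI_Finalize();
      return 0;
  }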

   george.

On Dec 12, 2008, at 08:20, Dorian Krause wrote:

> Hi again.
>
> I adapted my testing program by completely overwriting the window
> buffer with 1. This allows me to see at which places Open MPI writes.
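>
> A sketch of the fill-and-inspect idea (illustrative, not my exact
> code): anything still equal to the sentinel after the closing fence
> was never touched by the MPI_Put, and shows up as nan in the dumps
> below.
>
>   #include <math.h>
>   #include <stdio.h>
>
>   /* fill the window memory with a sentinel before the access epoch */
>   static void fill_sentinel(double *mem, int n)
>   {
>       for (int i = 0; i < n; ++i)
>           mem[i] = NAN;
>   }
>
>   /* after the closing MPI_Win_fence, print what actually arrived */
>   static void dump(const double *mem, int nelems)
>   {
>       for (int i = 0; i < nelems; ++i)
>           printf("mem[%d] = { %.4f, %.4f, %.4f}\n", i,
>                  mem[3*i], mem[3*i+1], mem[3*i+2]);
>   }
>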
> The result is:
>
> *** -DO1=1 -DV1=1 *** (displ 3,2,1,0 , MPI_Type_create_indexed_block)
> mem[0] = { 0.0000, 0.0000, 0.0000}
> mem[1] = { 0.0000, 0.0000, 0.0000}
> mem[2] = { 0.0000, 0.0000, 0.0000}
> mem[3] = { nan, nan, nan}
> mem[4] = { nan, nan, nan}
> mem[5] = { nan, nan, nan}
> mem[6] = { nan, nan, nan}
> mem[7] = { nan, nan, nan}
> mem[8] = { nan, nan, nan}
> mem[9] = { nan, nan, nan}
> *** -DO1=1 -DV2=1 *** MPI_Type_contiguous(4, mpi_double3, &mpit)
> mem[0] = { 0.0000, 1.0000, 2.0000}
> mem[1] = { 3.0000, 4.0000, 5.0000}
> mem[2] = { 6.0000, 7.0000, 8.0000}
> mem[3] = { 9.0000, 10.0000, 11.0000}
> mem[4] = { nan, nan, nan}
> mem[5] = { nan, nan, nan}
> mem[6] = { nan, nan, nan}
> mem[7] = { nan, nan, nan}
> mem[8] = { nan, nan, nan}
> mem[9] = { nan, nan, nan}
> *** -DO2=1 -DV1=1 *** (displ 0,1,2,3 , MPI_Type_create_indexed_block)
> mem[0] = { 0.0000, 0.0000, 0.0000}
> mem[1] = { 0.0000, 0.0000, 0.0000}
> mem[2] = { 0.0000, 0.0000, 0.0000}
> mem[3] = { 0.0000, 0.0000, 0.0000}
> mem[4] = { nan, nan, nan}
> mem[5] = { nan, nan, nan}
> mem[6] = { nan, nan, nan}
> mem[7] = { nan, nan, nan}
> mem[8] = { nan, nan, nan}
> mem[9] = { nan, nan, nan}
> *** -DO2=1 -DV2=1 *** MPI_Type_contiguous(4, mpi_double3, &mpit)
> mem[0] = { 0.0000, 1.0000, 2.0000}
> mem[1] = { 3.0000, 4.0000, 5.0000}
> mem[2] = { 6.0000, 7.0000, 8.0000}
> mem[3] = { 9.0000, 10.0000, 11.0000}
> mem[4] = { nan, nan, nan}
> mem[5] = { nan, nan, nan}
> mem[6] = { nan, nan, nan}
> mem[7] = { nan, nan, nan}
> mem[8] = { nan, nan, nan}
> mem[9] = { nan, nan, nan}
>
> Note that for the reversed ordering (3,2,1,0) only 3 of the 4 blocks
> are written. If I use the displacements 3,2,1,8 I get
>
> *** -DO1=1 -DV1=1 ***
> mem[0] = { 0.0000, 0.0000, 0.0000}
> mem[1] = { 0.0000, 0.0000, 0.0000}
> mem[2] = { 0.0000, 0.0000, 0.0000}
> mem[3] = { nan, nan, nan}
> mem[4] = { nan, nan, nan}
> mem[5] = { nan, nan, nan}
> mem[6] = { nan, nan, nan}
> mem[7] = { nan, nan, nan}
> mem[8] = { 0.0000, 0.0000, 0.0000}
> mem[9] = { nan, nan, nan}
>
> but 3,2,8,1 yields
>
> *** -DO1=1 -DV1=1 ***
> mem[0] = { 0.0000, 0.0000, 0.0000}
> mem[1] = { 0.0000, 0.0000, 0.0000}
> mem[2] = { 0.0000, 0.0000, 0.0000}
> mem[3] = { nan, nan, nan}
> mem[4] = { nan, nan, nan}
> mem[5] = { nan, nan, nan}
> mem[6] = { nan, nan, nan}
> mem[7] = { nan, nan, nan}
> mem[8] = { nan, nan, nan}
> mem[9] = { nan, nan, nan}
>
> Dorian
>
>
>> -----Original Message-----
>> From: "Dorian Krause" <doriankrause_at_[hidden]>
>> Sent: 12.12.08 13:49:25
>> To: Open MPI Users <users_at_[hidden]>
>> Subject: Re: [OMPI users] Onesided + derived datatypes
>
>
>> Thanks George (and Brian :)).
>>
>> The MPI_Put error is gone. Did you take a look at the problem
>> that the PUT doesn't work with the block-indexed type? I'm
>> still getting the following output (V1 corresponds to the datatype
>> created with MPI_Type_create_indexed_block while the V2 type
>> is created with MPI_Type_contiguous; the ordering doesn't matter
>> anymore after your fix). This confuses me because I remember
>> that (on one machine) MPI_Put with MPI_Type_create_indexed_block
>> worked until the invalid datatype error showed up (after a couple
>> of timesteps).
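>>
>> (Roughly how the two variants are built -- a sketch, where mpit and
>> mpi_double3 are the names from the test and V1/V2 the compile flags:
>>
>>   int displs[4] = {0, 1, 2, 3};
>>   MPI_Datatype mpit;
>> #if V1   /* indexed-block variant */
>>   MPI_Type_create_indexed_block(4, 1, displs, mpi_double3, &mpit);
>> #else    /* V2: contiguous variant */
>>   MPI_Type_contiguous(4, mpi_double3, &mpit);
>> #endif
>>   MPI_Type_commit(&mpit);
>>
>> With identity displacements both describe one contiguous run of four
>> mpi_double3 elements, so V1 and V2 should behave identically.)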
>>
>> *** -DO1=1 -DV1=1 ***
>> mem[0] = { 0.0000, 0.0000, 0.0000}
>> mem[1] = { 0.0000, 0.0000, 0.0000}
>> mem[2] = { 0.0000, 0.0000, 0.0000}
>> mem[3] = { 0.0000, 0.0000, 0.0000}
>> mem[4] = { 0.0000, 0.0000, 0.0000}
>> mem[5] = { 0.0000, 0.0000, 0.0000}
>> mem[6] = { 0.0000, 0.0000, 0.0000}
>> mem[7] = { 0.0000, 0.0000, 0.0000}
>> mem[8] = { 0.0000, 0.0000, 0.0000}
>> mem[9] = { 0.0000, 0.0000, 0.0000}
>> *** -DO1=1 -DV2=1 ***
>> mem[0] = { 5.0000, 0.0000, 0.0000}
>> mem[1] = { 0.0000, 0.0000, -1.0000}
>> mem[2] = { 0.0000, 0.0000, 0.0000}
>> mem[3] = { 0.0000, 0.0000, 0.0000}
>> mem[4] = { 0.0000, 0.0000, 0.0000}
>> mem[5] = { 0.0000, 0.0000, 0.0000}
>> mem[6] = { 0.0000, 0.0000, 0.0000}
>> mem[7] = { 0.0000, 0.0000, 0.0000}
>> mem[8] = { 0.0000, 0.0000, 0.0000}
>> mem[9] = { 0.0000, 0.0000, 0.0000}
>> *** -DO2=1 -DV1=1 ***
>> mem[0] = { 0.0000, 0.0000, 0.0000}
>> mem[1] = { 0.0000, 0.0000, 0.0000}
>> mem[2] = { 0.0000, 0.0000, 0.0000}
>> mem[3] = { 0.0000, 0.0000, 0.0000}
>> mem[4] = { 0.0000, 0.0000, 0.0000}
>> mem[5] = { 0.0000, 0.0000, 0.0000}
>> mem[6] = { 0.0000, 0.0000, 0.0000}
>> mem[7] = { 0.0000, 0.0000, 0.0000}
>> mem[8] = { 0.0000, 0.0000, 0.0000}
>> mem[9] = { 0.0000, 0.0000, 0.0000}
>> *** -DO2=1 -DV2=1 ***
>> mem[0] = { 5.0000, 0.0000, 0.0000}
>> mem[1] = { 0.0000, 0.0000, -1.0000}
>> mem[2] = { 0.0000, 0.0000, 0.0000}
>> mem[3] = { 0.0000, 0.0000, 0.0000}
>> mem[4] = { 0.0000, 0.0000, 0.0000}
>> mem[5] = { 0.0000, 0.0000, 0.0000}
>> mem[6] = { 0.0000, 0.0000, 0.0000}
>> mem[7] = { 0.0000, 0.0000, 0.0000}
>> mem[8] = { 0.0000, 0.0000, 0.0000}
>> mem[9] = { 0.0000, 0.0000, 0.0000}
>>
>>
>> Thanks for your help.
>>
>> Dorian
>>
>>
>>> -----Original Message-----
>>> From: "George Bosilca" <bosilca_at_[hidden]>
>>> Sent: 12.12.08 01:35:57
>>> To: Open MPI Users <users_at_[hidden]>
>>> Subject: Re: [OMPI users] Onesided + derived datatypes
>>
>>
>>> Dorian,
>>>
>>> You are right: the datatype generated using the block-indexed
>>> function is a legal datatype. We wrongly detected some overlapping
>>> regions in its description (overlapping regions are illegal
>>> according to the MPI standard). Since the detection of such
>>> overlapping regions is a very expensive process if we want to avoid
>>> false positives (such as your datatype), I prefer to remove the
>>> check completely.
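>>>
>>> (Your reversed ordering is an example of such a false positive: an
>>> indexed-block type whose blocks are disjoint but listed in
>>> decreasing order, roughly
>>>
>>>   int displs[4] = {3, 2, 1, 0};
>>>   MPI_Datatype t;
>>>   MPI_Type_create_indexed_block(4, 1, displs, mpi_double3, &t);
>>>
>>> is perfectly legal, yet was flagged as overlapping.)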
>>>
>>> To keep it short, I just committed a patch (r20120) to the trunk,
>>> and I'll take care of moving it into the 1.3 and 1.2.9 branches.
>>>
>>> Thanks for your help,
>>> george.
>>>
>>> On Dec 10, 2008, at 18:07, doriankrause wrote:
>>>
>>>> Hi List,
>>>>
>>>> I have an MPI program which uses one-sided communication with
>>>> derived datatypes (MPI_Type_create_indexed_block). I developed the
>>>> code with MPICH2 and unfortunately didn't think of trying it out
>>>> with Open MPI. Now that I'm "porting" the application to Open MPI,
>>>> I'm facing some problems. On most machines I get a SIGSEGV in
>>>> MPI_Win_fence; sometimes an invalid datatype error shows up. I ran
>>>> the program under Valgrind and didn't get anything valuable. Since
>>>> I can't see a reason for this problem (at least if I understand the
>>>> standard correctly), I wrote the attached test program.
>>>>
>>>> Here are my experiences:
>>>>
>>>> * If I compile without ONESIDED defined, everything works and V1
>>>>   and V2 give the same results.
>>>> * If I compile with ONESIDED and V2 defined (MPI_Type_contiguous),
>>>>   it works.
>>>> * ONESIDED + V1 + O2: no errors, but obviously nothing is sent?
>>>>   (Am I right in assuming that V1+O2 and V2 should be equivalent?)
>>>> * ONESIDED + V1 + O1:
>>>>   [m02:03115] *** An error occurred in MPI_Put
>>>>   [m02:03115] *** on win
>>>>   [m02:03115] *** MPI_ERR_TYPE: invalid datatype
>>>>   [m02:03115] *** MPI_ERRORS_ARE_FATAL (goodbye)
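>>>>
>>>> (For context, the communication step is the usual fence-bracketed
>>>> epoch -- a sketch with illustrative names, not the exact code from
>>>> ompitest.cc:
>>>>
>>>>   MPI_Win_fence(0, win);
>>>> #ifdef ONESIDED
>>>>   /* mpit is the derived datatype selected by V1/V2 */
>>>>   MPI_Put(sendbuf, 1, mpit, peer, 0, 1, mpit, win);
>>>> #endif
>>>>   MPI_Win_fence(0, win);
>>>>
>>>> where peer is the target rank and sendbuf the local source buffer.)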
>>>>
>>>> I didn't get a segfault as in the "real life" example, but if
>>>> ompitest.cc is correct, it means that Open MPI is buggy when it
>>>> comes to one-sided communication and (some) derived datatypes, and
>>>> that the problem is probably not in my code.
>>>>
>>>> I'm using Open MPI 1.2.8 with the newest gcc, 4.3.2, but the same
>>>> behaviour can be seen with gcc 3.3.1 and Intel 10.1.
>>>>
>>>> Please correct me if ompitest.cc contains errors. Otherwise I would
>>>> be glad to hear how I should report these problems to the
>>>> developers (if they don't read this).
>>>>
>>>> Thanks + best regards
>>>>
>>>> Dorian
>>>>
>>>>
>>>>
>>>>
>>>> <ompitest.tar.gz>
>>>