Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Onesided + derived datatypes
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-12-13 07:59:19


George -- you had a commit after this (r20123) -- did that fix the
problem?

On Dec 12, 2008, at 8:14 PM, George Bosilca wrote:

> Dorian,
>
> I looked into this again. So far I can confirm that the datatype is
> correctly created and always contains the correct values
> (internally). If you use send/recv instead of one-sided, the output
> is exactly what you expect. With one-sided there are several strange
> things. What I can say so far is that everything works fine, except
> when the block-indexed datatype is used as the remote datatype in
> the MPI_Put operation. In this case the remote memory is not
> modified.
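>
> To make the failing combination concrete, here is a minimal sketch
> (the window win, origin buffer obuf, and target rank are assumed to
> be set up elsewhere; the element type is simplified to MPI_DOUBLE):
>
>    MPI_Datatype blocktype;
>    int displs[4] = {3, 2, 1, 0};
>
>    MPI_Type_create_indexed_block(4, 1, displs, MPI_DOUBLE, &blocktype);
>    MPI_Type_commit(&blocktype);
>
>    /* origin side is contiguous; the block-indexed type only
>       describes the layout inside the remote window */
>    MPI_Put(obuf, 4, MPI_DOUBLE, 1 /* target rank */, 0,
>            1, blocktype, win);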
>
> george.
>
>
> On Dec 12, 2008, at 08:20, Dorian Krause wrote:
>
>> Hi again.
>>
>> I adapted my test program by overwriting the window buffer
>> completely with 1. This allows me to see at which places OpenMPI
>> writes. The result is:
>>
>> *** -DO1=1 -DV1=1 *** (displ 3,2,1,0, MPI_Type_create_indexed_block)
>> mem[0] = { 0.0000, 0.0000, 0.0000}
>> mem[1] = { 0.0000, 0.0000, 0.0000}
>> mem[2] = { 0.0000, 0.0000, 0.0000}
>> mem[3] = { nan, nan, nan}
>> mem[4] = { nan, nan, nan}
>> mem[5] = { nan, nan, nan}
>> mem[6] = { nan, nan, nan}
>> mem[7] = { nan, nan, nan}
>> mem[8] = { nan, nan, nan}
>> mem[9] = { nan, nan, nan}
>> *** -DO1=1 -DV2=1 *** MPI_Type_contiguous(4, mpi_double3, &mpit)
>> mem[0] = { 0.0000, 1.0000, 2.0000}
>> mem[1] = { 3.0000, 4.0000, 5.0000}
>> mem[2] = { 6.0000, 7.0000, 8.0000}
>> mem[3] = { 9.0000, 10.0000, 11.0000}
>> mem[4] = { nan, nan, nan}
>> mem[5] = { nan, nan, nan}
>> mem[6] = { nan, nan, nan}
>> mem[7] = { nan, nan, nan}
>> mem[8] = { nan, nan, nan}
>> mem[9] = { nan, nan, nan}
>> *** -DO2=1 -DV1=1 *** (displ 0,1,2,3, MPI_Type_create_indexed_block)
>> mem[0] = { 0.0000, 0.0000, 0.0000}
>> mem[1] = { 0.0000, 0.0000, 0.0000}
>> mem[2] = { 0.0000, 0.0000, 0.0000}
>> mem[3] = { 0.0000, 0.0000, 0.0000}
>> mem[4] = { nan, nan, nan}
>> mem[5] = { nan, nan, nan}
>> mem[6] = { nan, nan, nan}
>> mem[7] = { nan, nan, nan}
>> mem[8] = { nan, nan, nan}
>> mem[9] = { nan, nan, nan}
>> *** -DO2=1 -DV2=1 *** MPI_Type_contiguous(4, mpi_double3, &mpit)
>> mem[0] = { 0.0000, 1.0000, 2.0000}
>> mem[1] = { 3.0000, 4.0000, 5.0000}
>> mem[2] = { 6.0000, 7.0000, 8.0000}
>> mem[3] = { 9.0000, 10.0000, 11.0000}
>> mem[4] = { nan, nan, nan}
>> mem[5] = { nan, nan, nan}
>> mem[6] = { nan, nan, nan}
>> mem[7] = { nan, nan, nan}
>> mem[8] = { nan, nan, nan}
>> mem[9] = { nan, nan, nan}
>>
>> Note that for the reversed ordering (3,2,1,0) only 3 lines are
>> written. If I use displacements 3,2,1,8
>> I get
>>
>> *** -DO1=1 -DV1=1 ***
>> mem[0] = { 0.0000, 0.0000, 0.0000}
>> mem[1] = { 0.0000, 0.0000, 0.0000}
>> mem[2] = { 0.0000, 0.0000, 0.0000}
>> mem[3] = { nan, nan, nan}
>> mem[4] = { nan, nan, nan}
>> mem[5] = { nan, nan, nan}
>> mem[6] = { nan, nan, nan}
>> mem[7] = { nan, nan, nan}
>> mem[8] = { 0.0000, 0.0000, 0.0000}
>> mem[9] = { nan, nan, nan}
>>
>> but 3,2,8,1 yields
>>
>> *** -DO1=1 -DV1=1 ***
>> mem[0] = { 0.0000, 0.0000, 0.0000}
>> mem[1] = { 0.0000, 0.0000, 0.0000}
>> mem[2] = { 0.0000, 0.0000, 0.0000}
>> mem[3] = { nan, nan, nan}
>> mem[4] = { nan, nan, nan}
>> mem[5] = { nan, nan, nan}
>> mem[6] = { nan, nan, nan}
>> mem[7] = { nan, nan, nan}
>> mem[8] = { nan, nan, nan}
>> mem[9] = { nan, nan, nan}
>>
>> Dorian
>>
>>
>>> -----Original Message-----
>>> From: "Dorian Krause" <doriankrause_at_[hidden]>
>>> Sent: 12.12.08 13:49:25
>>> To: Open MPI Users <users_at_[hidden]>
>>> Subject: Re: [OMPI users] Onesided + derived datatypes
>>
>>
>>> Thanks George (and Brian :)).
>>>
>>> The MPI_Put error is gone. Did you take a look at the problem
>>> that the PUT doesn't work with the block-indexed type? I'm
>>> still getting the following output (V1 corresponds to the datatype
>>> created with MPI_Type_create_indexed_block while the V2 type
>>> is created with MPI_Type_contiguous; the ordering doesn't matter
>>> anymore after your fix), which confuses me because I remember
>>> that (on one machine) MPI_Put with MPI_Type_create_indexed
>>> worked until the invalid datatype error showed up (after a couple
>>> of timesteps).
>>>
>>> *** -DO1=1 -DV1=1 ***
>>> mem[0] = { 0.0000, 0.0000, 0.0000}
>>> mem[1] = { 0.0000, 0.0000, 0.0000}
>>> mem[2] = { 0.0000, 0.0000, 0.0000}
>>> mem[3] = { 0.0000, 0.0000, 0.0000}
>>> mem[4] = { 0.0000, 0.0000, 0.0000}
>>> mem[5] = { 0.0000, 0.0000, 0.0000}
>>> mem[6] = { 0.0000, 0.0000, 0.0000}
>>> mem[7] = { 0.0000, 0.0000, 0.0000}
>>> mem[8] = { 0.0000, 0.0000, 0.0000}
>>> mem[9] = { 0.0000, 0.0000, 0.0000}
>>> *** -DO1=1 -DV2=1 ***
>>> mem[0] = { 5.0000, 0.0000, 0.0000}
>>> mem[1] = { 0.0000, 0.0000, -1.0000}
>>> mem[2] = { 0.0000, 0.0000, 0.0000}
>>> mem[3] = { 0.0000, 0.0000, 0.0000}
>>> mem[4] = { 0.0000, 0.0000, 0.0000}
>>> mem[5] = { 0.0000, 0.0000, 0.0000}
>>> mem[6] = { 0.0000, 0.0000, 0.0000}
>>> mem[7] = { 0.0000, 0.0000, 0.0000}
>>> mem[8] = { 0.0000, 0.0000, 0.0000}
>>> mem[9] = { 0.0000, 0.0000, 0.0000}
>>> *** -DO2=1 -DV1=1 ***
>>> mem[0] = { 0.0000, 0.0000, 0.0000}
>>> mem[1] = { 0.0000, 0.0000, 0.0000}
>>> mem[2] = { 0.0000, 0.0000, 0.0000}
>>> mem[3] = { 0.0000, 0.0000, 0.0000}
>>> mem[4] = { 0.0000, 0.0000, 0.0000}
>>> mem[5] = { 0.0000, 0.0000, 0.0000}
>>> mem[6] = { 0.0000, 0.0000, 0.0000}
>>> mem[7] = { 0.0000, 0.0000, 0.0000}
>>> mem[8] = { 0.0000, 0.0000, 0.0000}
>>> mem[9] = { 0.0000, 0.0000, 0.0000}
>>> *** -DO2=1 -DV2=1 ***
>>> mem[0] = { 5.0000, 0.0000, 0.0000}
>>> mem[1] = { 0.0000, 0.0000, -1.0000}
>>> mem[2] = { 0.0000, 0.0000, 0.0000}
>>> mem[3] = { 0.0000, 0.0000, 0.0000}
>>> mem[4] = { 0.0000, 0.0000, 0.0000}
>>> mem[5] = { 0.0000, 0.0000, 0.0000}
>>> mem[6] = { 0.0000, 0.0000, 0.0000}
>>> mem[7] = { 0.0000, 0.0000, 0.0000}
>>> mem[8] = { 0.0000, 0.0000, 0.0000}
>>> mem[9] = { 0.0000, 0.0000, 0.0000}
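>>>
>>> For reference, with displacements 0,1,2,3 the two variants above
>>> should describe exactly the same layout; a sketch of the two
>>> constructions (element type simplified to MPI_DOUBLE instead of
>>> the double3 struct used in the test):
>>>
>>>    int displs[4] = {0, 1, 2, 3};
>>>    MPI_Datatype v1, v2;
>>>
>>>    /* V1: four blocks of one element each, at unit displacements */
>>>    MPI_Type_create_indexed_block(4, 1, displs, MPI_DOUBLE, &v1);
>>>
>>>    /* V2: four contiguous elements -- the same type map as v1 */
>>>    MPI_Type_contiguous(4, MPI_DOUBLE, &v2);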
>>>
>>>
>>> Thanks for your help.
>>>
>>> Dorian
>>>
>>>
>>>> -----Original Message-----
>>>> From: "George Bosilca" <bosilca_at_[hidden]>
>>>> Sent: 12.12.08 01:35:57
>>>> To: Open MPI Users <users_at_[hidden]>
>>>> Subject: Re: [OMPI users] Onesided + derived datatypes
>>>
>>>
>>>> Dorian,
>>>>
>>>> You are right, the datatype generated using the block-indexed
>>>> function is a legal datatype. We wrongly detected some
>>>> overlapping regions in its description [overlapping regions are
>>>> illegal based on the MPI standard]. Since the detection of such
>>>> overlapping regions is a very expensive process if we don't want
>>>> any false positives (such as your datatype), I prefer to remove
>>>> the check completely.
>>>>
>>>> To keep it short: I just committed a patch (r20120) to the trunk,
>>>> and I'll take care of moving it into the 1.3 and 1.2.9 branches.
>>>>
>>>> Thanks for your help,
>>>> george.
>>>>
>>>> On Dec 10, 2008, at 18:07, doriankrause wrote:
>>>>
>>>>> Hi List,
>>>>>
>>>>> I have an MPI program which uses one-sided communication with
>>>>> derived datatypes (MPI_Type_create_indexed_block). I developed
>>>>> the code with MPICH2 and unfortunately didn't think about trying
>>>>> it out with OpenMPI. Now that I'm "porting" the application to
>>>>> OpenMPI I'm facing some problems. On most machines I get a
>>>>> SIGSEGV in MPI_Win_fence; sometimes an invalid datatype error
>>>>> shows up. I ran the program in Valgrind and didn't get anything
>>>>> valuable. Since I can't see a reason for this problem (at least
>>>>> if I understand the standard correctly), I wrote the attached
>>>>> test program.
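>>>>>
>>>>> The attachment is not reproduced here; a stripped-down sketch of
>>>>> the kind of pattern it exercises (buffer sizes, macro names, and
>>>>> the use of plain MPI_DOUBLE are illustrative, not taken from
>>>>> ompitest.cc) would look roughly like this, run with at least two
>>>>> ranks:
>>>>>
>>>>>   #include <mpi.h>
>>>>>   #include <stdio.h>
>>>>>
>>>>>   int main(int argc, char **argv)
>>>>>   {
>>>>>       const int N = 4;                 /* doubles transferred   */
>>>>>       double winbuf[4] = {0, 0, 0, 0}; /* exposed window memory */
>>>>>       double obuf[4]   = {1, 2, 3, 4}; /* origin data           */
>>>>>       int displs[4]    = {0, 1, 2, 3}; /* block displacements   */
>>>>>       int rank;
>>>>>       MPI_Datatype dtype;
>>>>>       MPI_Win win;
>>>>>
>>>>>       MPI_Init(&argc, &argv);
>>>>>       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>>>
>>>>>   #if defined(V1)
>>>>>       /* V1: block-indexed type, one double per block */
>>>>>       MPI_Type_create_indexed_block(N, 1, displs, MPI_DOUBLE, &dtype);
>>>>>   #else
>>>>>       /* V2: contiguous type covering the same region */
>>>>>       MPI_Type_contiguous(N, MPI_DOUBLE, &dtype);
>>>>>   #endif
>>>>>       MPI_Type_commit(&dtype);
>>>>>
>>>>>       MPI_Win_create(winbuf, N * sizeof(double), sizeof(double),
>>>>>                      MPI_INFO_NULL, MPI_COMM_WORLD, &win);
>>>>>
>>>>>       MPI_Win_fence(0, win);
>>>>>       if (rank == 0)        /* rank 0 puts into rank 1's window */
>>>>>           MPI_Put(obuf, N, MPI_DOUBLE, 1, 0, 1, dtype, win);
>>>>>       MPI_Win_fence(0, win);
>>>>>
>>>>>       if (rank == 1)
>>>>>           for (int i = 0; i < N; ++i)
>>>>>               printf("winbuf[%d] = %f\n", i, winbuf[i]);
>>>>>
>>>>>       MPI_Win_free(&win);
>>>>>       MPI_Type_free(&dtype);
>>>>>       MPI_Finalize();
>>>>>       return 0;
>>>>>   }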
>>>>>
>>>>> Here are my experiences:
>>>>>
>>>>> * If I compile without ONESIDED defined, everything works and
>>>>>   V1 and V2 give the same results.
>>>>> * If I compile with ONESIDED and V2 defined
>>>>>   (MPI_Type_contiguous), it works.
>>>>> * ONESIDED + V1 + O2: No errors, but obviously nothing is sent?
>>>>>   (Am I right in assuming that V1+O2 and V2 should be
>>>>>   equivalent?)
>>>>> * ONESIDED + V1 + O1:
>>>>> [m02:03115] *** An error occurred in MPI_Put
>>>>> [m02:03115] *** on win
>>>>> [m02:03115] *** MPI_ERR_TYPE: invalid datatype
>>>>> [m02:03115] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>>
>>>>> I didn't get a segfault as in the "real life" example, but if
>>>>> ompitest.cc is correct, it means that OpenMPI is buggy when it
>>>>> comes to one-sided communication and (some) derived datatypes,
>>>>> so the problem is probably not in my code.
>>>>>
>>>>> I'm using OpenMPI-1.2.8 with the newest gcc 4.3.2, but the same
>>>>> behaviour can be seen with gcc-3.3.1 and Intel 10.1.
>>>>>
>>>>> Please correct me if ompitest.cc contains errors. Otherwise I
>>>>> would be glad to hear how I should report these problems to the
>>>>> developers (if they don't read this).
>>>>>
>>>>> Thanks + best regards
>>>>>
>>>>> Dorian
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> <ompitest.tar.gz>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>

-- 
Jeff Squyres
Cisco Systems