Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Fwd: [OMPI users] Onesided + derived datatypes
From: George Bosilca (bosilca_at_[hidden])
Date: 2008-12-11 19:26:00


Fixed in r20120.

   george.

On Dec 11, 2008, at 19:14 , Brian Barrett wrote:

> I think that's a reasonable solution. However, the words "not it"
> come to mind. Sorry, but I have way too much on my plate this
> month. By the way, in case no one noticed, I had e-mailed my
> findings to devel. Someone might want to reply to Dorian's e-mail
> on users.
>
>
> Brian
>
> On Dec 11, 2008, at 2:31 PM, George Bosilca wrote:
>
>> Brian,
>>
>> You're right, the datatype is being too cautious with the
>> boundaries when detecting the overlap. There is no good solution to
>> detect the overlap except parsing the whole memory layout to check
>> the status of every predefined type. As one can imagine this is a
>> very expensive operation. This is the reason I preferred to use the
>> true extent and the size of the data to try to detect the overlap.
>> This approach is a lot faster, but has poor accuracy.
>>
>> The best solution I can think of in the short term is to remove
>> the overlap check completely. This will have absolutely no impact
>> on the way we pack the data, but can lead to unexpected results
>> when we unpack and the data overlap. But I guess this can be
>> considered a user error, as the MPI standard clearly states that
>> the result of such an operation is ... unexpected.
>>
>> george.
>>
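To illustrate the trade-off described above, here is a minimal C++ sketch of a size-versus-span test. It is not Open MPI's implementation: `Block`, `layout_size`, `layout_true_extent`, and `surely_overlaps` are hypothetical names, and displacements are counted in elements rather than bytes. The test can only prove an overlap when the layout carries more elements than its span can hold; any other outcome is inconclusive, which is one way a shortcut of this kind loses accuracy.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical layout entry: elements [disp, disp + len), in element units.
struct Block {
    long disp;
    long len;
};

// Total number of elements described by the layout
// (plays the role of the datatype size).
long layout_size(const std::vector<Block>& blocks) {
    long total = 0;
    for (const Block& b : blocks) total += b.len;
    return total;
}

// Distance from the lowest to one past the highest element touched
// (plays the role of the true extent). Requires a non-empty layout.
long layout_true_extent(const std::vector<Block>& blocks) {
    assert(!blocks.empty());
    long lo = blocks.front().disp;
    long hi = blocks.front().disp + blocks.front().len;
    for (const Block& b : blocks) {
        lo = std::min(lo, b.disp);
        hi = std::max(hi, b.disp + b.len);
    }
    return hi - lo;
}

// Cheap test: more elements than the span can hold means at least one
// element is described twice. A false result is inconclusive; it does
// NOT prove that the blocks are disjoint.
bool surely_overlaps(const std::vector<Block>& blocks) {
    if (blocks.empty()) return false;
    return layout_size(blocks) > layout_true_extent(blocks);
}
```

For displacements 3, 2, 1, 0 with block length 1, the size and the span are both 4, so this test reports no certain overlap. The blocks [0,2), [1,3) and [4,5) do overlap, yet their size and span are both 5, so the test stays silent there as well.
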
>> On Dec 10, 2008, at 22:20 , Brian Barrett wrote:
>>
>>> Hi all -
>>>
>>> I looked into this, and it appears to be datatype related. If the
>>> displacements are set to 3, 2, 1, 0, then the datatype will fail
>>> the type checks for one-sided because is_overlapped() returns 1
>>> for the datatype. My reading of the standard seems to indicate
>>> this should not be. I haven't looked into the problems with
>>> displacement set to 0, 1, 2, 3, but I'm guessing it has something
>>> to do with the reverse problem.
>>>
>>> This looks like a datatype issue, so it's out of my realm of
>>> expertise. Can someone else take a look?
>>>
>>> Brian
>>>
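As a sanity check on the displacement case above, the following C++ sketch performs an exact overlap test on an indexed-block layout by sorting the blocks and comparing neighbours. It assumes displacements are in units of the base element (as in MPI_Type_create_indexed_block) and a positive block length; `Block`, `indexed_block`, and `blocks_overlap` are hypothetical helpers, not Open MPI functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout entry: elements [disp, disp + len), in element units.
struct Block {
    long disp;
    long len;
};

// Build the block list for an indexed-block layout: one block of
// `blocklen` elements at each displacement, in the order given.
std::vector<Block> indexed_block(long blocklen, const std::vector<long>& disps) {
    assert(blocklen > 0);
    std::vector<Block> blocks;
    blocks.reserve(disps.size());
    for (long d : disps) blocks.push_back(Block{d, blocklen});
    return blocks;
}

// Exact check: after sorting by start, two blocks overlap if and only if
// some block starts before its predecessor ends.
bool blocks_overlap(std::vector<Block> blocks) {
    std::sort(blocks.begin(), blocks.end(),
              [](const Block& a, const Block& b) { return a.disp < b.disp; });
    for (std::size_t i = 1; i < blocks.size(); ++i) {
        if (blocks[i].disp < blocks[i - 1].disp + blocks[i - 1].len) {
            return true;
        }
    }
    return false;
}
```

With block length 1, both 3, 2, 1, 0 and 0, 1, 2, 3 yield four disjoint one-element blocks, so the check reports no overlap for either ordering. The sort makes the cost O(n log n) in the number of blocks.
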
>>> Begin forwarded message:
>>>
>>>> From: doriankrause <doriankrause_at_[hidden]>
>>>> Date: December 10, 2008 4:07:55 PM MST
>>>> To: users_at_[hidden]
>>>> Subject: [OMPI users] Onesided + derived datatypes
>>>> Reply-To: Open MPI Users <users_at_[hidden]>
>>>>
>>>> Hi List,
>>>>
>>>> I have an MPI program which uses one-sided communication with derived
>>>> datatypes (MPI_Type_create_indexed_block). I developed the code with
>>>> MPICH2 and unfortunately didn't think about trying it out with
>>>> OpenMPI. Now that I'm "porting" the application to OpenMPI I'm facing
>>>> some problems. On most machines I get a SIGSEGV in MPI_Win_fence;
>>>> sometimes an invalid datatype shows up. I ran the program in Valgrind
>>>> and didn't get anything valuable. Since I can't see a reason for this
>>>> problem (at least if I understand the standard correctly), I wrote the
>>>> attached test program.
>>>>
>>>> Here are my experiences:
>>>>
>>>> * If I compile without ONESIDED defined, everything works and V1 and V2
>>>>   give the same results.
>>>> * If I compile with ONESIDED and V2 defined (MPI_Type_contiguous) it works.
>>>> * ONESIDED + V1 + O2: No errors, but obviously nothing is sent? (Am I
>>>>   right in assuming that V1+O2 and V2 should be equivalent?)
>>>> * ONESIDED + V1 + O1:
>>>> [m02:03115] *** An error occurred in MPI_Put
>>>> [m02:03115] *** on win
>>>> [m02:03115] *** MPI_ERR_TYPE: invalid datatype
>>>> [m02:03115] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>
>>>> I didn't get a segfault as in the "real life example", but if ompitest.cc
>>>> is correct it means that OpenMPI is buggy when it comes to one-sided
>>>> communication and (some) derived datatypes, so it is probably not
>>>> a problem in my code.
>>>>
>>>> I'm using OpenMPI-1.2.8 with the newest gcc 4.3.2 but the same behaviour
>>>> can be seen with gcc-3.3.1 and intel 10.1.
>>>>
>>>> Please correct me if ompitest.cc contains errors. Otherwise I would be
>>>> glad to hear how I should report these problems to the developers (if
>>>> they don't read this).
>>>>
>>>> Thanks + best regards
>>>>
>>>> Dorian
>>>>
>>>> <ompitest.tar.gz>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel