Subject: Re: [OMPI devel] Fwd: [OMPI users] Onesided + derived datatypes
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-12-13 16:11:06


No problem-o.

George -- can you please file a bug?

On Dec 13, 2008, at 3:11 PM, Brian Barrett wrote:

> Sorry, I really won't have time to look until after Christmas. I'll
> put it on the to-do list, but that's the soonest it has a prayer of
> reaching the top.
>
> Brian
>
> On Dec 13, 2008, at 1:02 PM, George Bosilca wrote:
>
>> Brian,
>>
>> I found a second problem with rebuilding the datatype on the
>> remote side. Originally, the displacements were computed
>> incorrectly. This is now fixed. However, the data at the end of
>> the fence is still not correct on the remote side.
>>
>> I can confirm that the packed message contains only zeros instead
>> of the real values, but I couldn't figure out how those zeros got
>> there. The pack function works correctly for MPI_Send, so I don't
>> see any reason it shouldn't work the same way for MPI_Put. As
>> you're the one-sided guy in ompi, can you take a look at MPI_Put
>> to see why the data is incorrect?
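>>
>> One way to check the pack step in isolation is to push the same
>> datatype through the public MPI_Pack interface. A sketch (inside
>> an initialized MPI program, using an indexed-block type like the
>> one from Dorian's test):
>>
>>   int displs[4] = {3, 2, 1, 0}, pos = 0;
>>   int src[4] = {1, 2, 3, 4};
>>   char packed[4 * sizeof(int)];
>>   MPI_Datatype itype;
>>
>>   MPI_Type_create_indexed_block(4, 1, displs, MPI_INT, &itype);
>>   MPI_Type_commit(&itype);
>>   MPI_Pack(src, 1, itype, packed, (int)sizeof(packed), &pos,
>>            MPI_COMM_WORLD);
>>   /* 'packed' should now hold 4, 3, 2, 1; a buffer of zeros here
>>    * would match what the broken one-sided path produces. */
>>
>> The internal convertor is what actually runs for MPI_Put, but the
>> bytes it produces should match MPI_Pack's.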
>>
>> george.
>>
>> On Dec 11, 2008, at 19:14, Brian Barrett wrote:
>>
>>> I think that's a reasonable solution. However, the words "not it"
>>> come to mind. Sorry, but I have way too much on my plate this
>>> month. By the way, in case no one noticed, I had e-mailed my
>>> findings to devel. Someone might want to reply to Dorian's e-mail
>>> on users.
>>>
>>>
>>> Brian
>>>
>>> On Dec 11, 2008, at 2:31 PM, George Bosilca wrote:
>>>
>>>> Brian,
>>>>
>>>> You're right, the datatype engine is too cautious about the
>>>> boundaries when detecting overlap. There is no good way to
>>>> detect overlap other than parsing the whole memory layout and
>>>> checking the status of every predefined type. As one can
>>>> imagine, this is a very expensive operation. That is the reason
>>>> I preferred to use the true extent and the size of the data to
>>>> try to detect overlap. This approach is a lot faster, but has
>>>> poor accuracy.
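>>>>
>>>> A rough sketch of the kind of cheap check I mean (an
>>>> illustration only, not the actual code in ompi):
>>>>
>>>>   #include <mpi.h>
>>>>
>>>>   /* Cheap overlap heuristic: compare the number of bytes the
>>>>    * type carries against the span of memory it touches. */
>>>>   static int overlap_heuristic(MPI_Datatype type)
>>>>   {
>>>>       int size;
>>>>       MPI_Aint true_lb, true_extent;
>>>>
>>>>       MPI_Type_size(type, &size);
>>>>       MPI_Type_get_true_extent(type, &true_lb, &true_extent);
>>>>
>>>>       /* More bytes than the span they sit in: some byte is
>>>>        * necessarily covered twice, so the type overlaps. */
>>>>       if ((MPI_Aint)size > true_extent)
>>>>           return 1;
>>>>
>>>>       /* Holes in the span: the blocks could still overlap, and
>>>>        * only a full parse of the layout could prove otherwise,
>>>>        * so a conservative check reports overlap here too. This
>>>>        * is where the false positives come from. */
>>>>       return (MPI_Aint)size < true_extent;
>>>>   }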
>>>>
>>>> The best solution I can think of in the short term is to remove
>>>> the overlap check completely. This will have absolutely no
>>>> impact on the way we pack the data, but it can lead to
>>>> unexpected results when we unpack and the data overlaps. I guess
>>>> this can be considered a user error, though, as the MPI standard
>>>> clearly states that the result of such an operation is ...
>>>> unexpected.
>>>>
>>>> george.
>>>>
>>>> On Dec 10, 2008, at 22:20, Brian Barrett wrote:
>>>>
>>>>> Hi all -
>>>>>
>>>>> I looked into this, and it appears to be datatype related. If
>>>>> the displacements are set to 3, 2, 1, 0, the datatype will fail
>>>>> the type checks for one-sided communication because
>>>>> is_overlapped() returns 1 for it. My reading of the standard
>>>>> seems to indicate that this should not be the case. I haven't
>>>>> looked into the problems with the displacements set to
>>>>> 0, 1, 2, 3, but I'm guessing it has something to do with the
>>>>> reverse problem.
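>>>>>
>>>>> For reference, the failing datatype is just a reversed
>>>>> permutation of unit blocks, something like this (a sketch, not
>>>>> the exact code from the test program):
>>>>>
>>>>>   /* Four disjoint unit blocks in reverse order: no byte is
>>>>>    * covered twice, so is_overlapped() should arguably
>>>>>    * return 0 for this type. */
>>>>>   int displs[4] = {3, 2, 1, 0};
>>>>>   MPI_Datatype reversed;
>>>>>   MPI_Type_create_indexed_block(4, 1, displs, MPI_INT,
>>>>>                                 &reversed);
>>>>>   MPI_Type_commit(&reversed);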
>>>>>
>>>>> This looks like a datatype issue, so it's out of my realm of
>>>>> expertise. Can someone else take a look?
>>>>>
>>>>> Brian
>>>>>
>>>>> Begin forwarded message:
>>>>>
>>>>>> From: doriankrause <doriankrause_at_[hidden]>
>>>>>> Date: December 10, 2008 4:07:55 PM MST
>>>>>> To: users_at_[hidden]
>>>>>> Subject: [OMPI users] Onesided + derived datatypes
>>>>>> Reply-To: Open MPI Users <users_at_[hidden]>
>>>>>>
>>>>>> Hi List,
>>>>>>
>>>>>> I have an MPI program which uses one-sided communication with
>>>>>> derived datatypes (MPI_Type_create_indexed_block). I developed
>>>>>> the code with MPICH2 and unfortunately didn't think to try it
>>>>>> with Open MPI. Now that I'm "porting" the application to
>>>>>> Open MPI, I'm facing some problems. On most machines I get a
>>>>>> SIGSEGV in MPI_Win_fence, and sometimes an invalid-datatype
>>>>>> error shows up. I ran the program under Valgrind and didn't
>>>>>> get anything valuable. Since I can't see a reason for this
>>>>>> problem (at least if I understand the standard correctly), I
>>>>>> wrote the attached test program.
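>>>>>>
>>>>>> The core of it looks roughly like this (a condensed sketch
>>>>>> reconstructed from the description, not the attached file
>>>>>> itself; V1/V2 and O1/O2 are presumably selected by #defines):
>>>>>>
>>>>>>   #include <mpi.h>
>>>>>>
>>>>>>   int main(int argc, char **argv)
>>>>>>   {
>>>>>>       /* Presumably one of the O1/O2 variants; the other
>>>>>>        * uses displacements 0, 1, 2, 3. */
>>>>>>       int displs[4] = {3, 2, 1, 0};
>>>>>>       int rank, src[4] = {1, 2, 3, 4}, dst[4] = {0, 0, 0, 0};
>>>>>>       MPI_Datatype itype;
>>>>>>       MPI_Win win;
>>>>>>
>>>>>>       MPI_Init(&argc, &argv);
>>>>>>       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>>>>
>>>>>>       /* V1: indexed-block datatype (V2 would use
>>>>>>        * MPI_Type_contiguous instead). */
>>>>>>       MPI_Type_create_indexed_block(4, 1, displs, MPI_INT,
>>>>>>                                     &itype);
>>>>>>       MPI_Type_commit(&itype);
>>>>>>
>>>>>>       MPI_Win_create(dst, 4 * sizeof(int), sizeof(int),
>>>>>>                      MPI_INFO_NULL, MPI_COMM_WORLD, &win);
>>>>>>
>>>>>>       MPI_Win_fence(0, win);
>>>>>>       if (rank == 0)  /* put into rank 1 through the derived
>>>>>>                        * target datatype */
>>>>>>           MPI_Put(src, 4, MPI_INT, 1, 0, 1, itype, win);
>>>>>>       MPI_Win_fence(0, win);  /* failure shows up here or in
>>>>>>                                * MPI_Put itself */
>>>>>>
>>>>>>       MPI_Win_free(&win);
>>>>>>       MPI_Type_free(&itype);
>>>>>>       MPI_Finalize();
>>>>>>       return 0;
>>>>>>   }
>>>>>>
>>>>>> (It needs at least two processes to run.)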
>>>>>>
>>>>>> Here are my experiences:
>>>>>>
>>>>>> * If I compile without ONESIDED defined, everything works and
>>>>>> V1 and V2 give the same results.
>>>>>> * If I compile with ONESIDED and V2 defined
>>>>>> (MPI_Type_contiguous), it works.
>>>>>> * ONESIDED + V1 + O2: no errors, but obviously nothing is
>>>>>> sent? (Am I right in assuming that V1+O2 and V2 should be
>>>>>> equivalent?)
>>>>>> * ONESIDED + V1 + O1:
>>>>>> [m02:03115] *** An error occurred in MPI_Put
>>>>>> [m02:03115] *** on win
>>>>>> [m02:03115] *** MPI_ERR_TYPE: invalid datatype
>>>>>> [m02:03115] *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>>>
>>>>>> I didn't get a segfault as in the "real life" example, but if
>>>>>> ompitest.cc is correct, it means that Open MPI is buggy when
>>>>>> it comes to one-sided communication and (some) derived
>>>>>> datatypes, and that the problem is probably not in my code.
>>>>>>
>>>>>> I'm using Open MPI 1.2.8 with the newest gcc, 4.3.2, but the
>>>>>> same behaviour can be seen with gcc 3.3.1 and Intel 10.1.
>>>>>>
>>>>>> Please correct me if ompitest.cc contains errors. Otherwise I
>>>>>> would be glad to hear how I should report these problems to
>>>>>> the developers (if they don't read this).
>>>>>>
>>>>>> Thanks + best regards
>>>>>>
>>>>>> Dorian
>>>>>>
>>>>>> <ompitest.tar.gz>

-- 
Jeff Squyres
Cisco Systems