You're right, the datatype is being too cautious with the boundaries
when detecting the overlap. There is no good way to detect the
overlap other than parsing the whole memory layout to check the status
of every predefined type. As one can imagine, this is a very expensive
operation. This is the reason I preferred to use the true extent and
the size of the data to try to detect the overlap. This approach is a
lot faster, but has poor accuracy.
The best solution I can think of in the short term is to remove the
overlap check completely. This will have absolutely no impact on the
way we pack the data, but can lead to unexpected results when we unpack
and the data overlap. But I guess this can be considered a user error,
as the MPI standard clearly states that the result of such an operation
is ... unexpected.
On Dec 10, 2008, at 22:20 , Brian Barrett wrote:
> Hi all -
> I looked into this, and it appears to be datatype related. If the
> displacements are set to 3, 2, 1, 0, then the datatype will fail
> the type checks for one-sided because is_overlapped() returns 1 for
> the datatype. My reading of the standard seems to indicate this
> should not be. I haven't looked into the problems with displacement
> set to 0, 1, 2, 3, but I'm guessing it has something to do with the
> reverse problem.
> This looks like a datatype issue, so it's out of my realm of
> expertise. Can someone else take a look?
> Begin forwarded message:
>> From: doriankrause <doriankrause_at_[hidden]>
>> Date: December 10, 2008 4:07:55 PM MST
>> To: users_at_[hidden]
>> Subject: [OMPI users] Onesided + derived datatypes
>> Reply-To: Open MPI Users <users_at_[hidden]>
>> Hi List,
>> I have an MPI program which uses one-sided communication with derived
>> datatypes (MPI_Type_create_indexed_block). I developed the code with
>> MPICH2 and unfortunately didn't think about trying it out with
>> OpenMPI. Now that I'm "porting" the application to OpenMPI I'm facing
>> some problems. On most machines I get a SIGSEGV in
>> sometimes an invalid datatype shows up. I ran the program in Valgrind
>> and didn't get anything valuable. Since I can't see a reason for this
>> problem (at least if I understand the standard correctly), I wrote
>> the attached test program.
>> Here are my experiences:
>> * If I compile without ONESIDED defined, everything works and V1
>>   and V2 give the same results
>> * If I compile with ONESIDED and V2 defined (MPI_Type_contiguous)
>>   it works.
>> * ONESIDED + V1 + O2: No errors but obviously nothing is sent? (Am
>>   I in assuming that V1+O2 and V2 should be equivalent?)
>> * ONESIDED + V1 + O1:
>> [m02:03115] *** An error occurred in MPI_Put
>> [m02:03115] *** on win
>> [m02:03115] *** MPI_ERR_TYPE: invalid datatype
>> [m02:03115] *** MPI_ERRORS_ARE_FATAL (goodbye)
>> I didn't get a segfault as in the "real life example" but if
>> is correct it means that OpenMPI is buggy when it comes to onesided
>> communication and (some) derived datatypes, so that it is probably
>> of problem in my code.
>> I'm using OpenMPI-1.2.8 with the newest gcc 4.3.2 but the same
>> can be seen with gcc-3.3.1 and intel 10.1.
>> Please correct me if ompitest.cc contains errors. Otherwise I would
>> be glad to hear how I should report these problems to the developers
>> (if they don't read this).
>> Thanks + best regards