Open MPI User's Mailing List Archives

From: George Bosilca (bosilca_at_[hidden])
Date: 2006-02-12 15:29:12


Yvan,

It's now corrected. Please use the trunk (nightly builds) starting from
revision 8997, or wait until Monday, when we will update the next stable
candidate. If you are in a hurry and feel like playing around with the MPI
code, you can apply the attached patch to the latest stable release.

   Thanks,
     george.

On Fri, 10 Feb 2006, George Bosilca wrote:

> Yvan,
>
> I'm looking into this one. So far I cannot reproduce it with the
> current version from the trunk. I will look into the stable versions.
> Until I figure out what's wrong, can you please use the nightly
> builds to run your test? Once the problem gets fixed, the fix will be
> included in the 1.0.2 release.
>
> BTW, which interconnect are you using? Ethernet?
>
> Thanks,
> george.
>
> On Feb 10, 2006, at 5:06 PM, Yvan Fournier wrote:
>
>> Hello,
>>
>> I seem to have encountered a bug in Open MPI 1.0 using indexed datatypes
>> with MPI_Recv (which seems to be of the "off by one" sort). I have
>> attached a test case, which is briefly explained below (as well as in the
>> source file). This case should run on two processes. I observed the bug
>> on 2 different Linux systems (single-processor Centrino under Suse 10.0
>> with gcc 4.0.2, dual-processor Xeon under Debian Sarge with gcc 3.4)
>> with Open MPI 1.0.1, and do not observe it using LAM 7.1.1 or MPICH2.
>>
>> Here is a summary of the case:
>>
>> ------------------
>>
>> Each processor reads a file ("data_p0" or "data_p1") giving a list of
>> global element ids. Some elements (vertices from a partitioned mesh)
>> may belong to both processors, so their ids may appear on both
>> processors: we have 7178 global vertices, 3654 and 3688 of them being
>> known by ranks 0 and 1 respectively.
>>
>> In this simplified version, we assign coordinates {x, y, z} to each
>> vertex equal to its global id number for rank 1, and the negative of
>> that for rank 0 (assigning the same values to x, y, and z). After
>> finishing the "ordered gather", rank 0 prints the global id and
>> coordinates of each vertex.
>>
>> Lines should print (for example) as:
>> 6456 ; 6456.00000 6456.00000 6456.00000
>> 6457 ; -6457.00000 -6457.00000 -6457.00000
>> depending on whether a vertex belongs only to rank 0 (negative
>> coordinates) or belongs to rank 1 (positive coordinates).
>>
>> With the OMPI 1.0.1 bug (observed on Suse Linux 10.0 with gcc 4.0 and on
>> Debian Sarge with gcc 3.4), we have for example for the last vertices:
>> 7176 ; 7175.00000 7175.00000 7176.00000
>> 7177 ; 7176.00000 7176.00000 7177.00000
>> which seems to indicate an "off by one" type bug in datatype handling.
>>
>> Not using an indexed datatype (i.e. not defining USE_INDEXED_DATATYPE
>> in the gather_test.c file), the bug disappears. Using the indexed
>> datatype with LAM MPI 7.1.1 or MPICH2, we do not reproduce the bug
>> either, so it does seem to be an Open MPI issue.
>>
>> ------------------
>>
>> Best regards,
>>
>> Yvan Fournier
>> <ompi_datatype_bug.tar.gz>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> "Half of what I say is meaningless; but I say it so that the other
> half may reach you"
> Kahlil Gibran
>

"We must accept finite disappointment, but we must never lose infinite
hope."
                                   Martin Luther King
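
For readers without the attached test case, the semantics the report relies on can be sketched in plain C, without MPI. Receiving with an indexed datatype should behave like a scatter: block i of the incoming message lands at displacement displs[i] in the receive buffer. The sketch below is not the code from gather_test.c. The function names are hypothetical, and it assumes 1-based global ids, 3 doubles per vertex, and displacements counted in doubles. The final checker flags rows like "7176 ; 7175.00000 7175.00000 7176.00000", where the three coordinates disagree with each other or with the id.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: coordinates for one rank's vertices, x = y = z = sign * id. */
static void fill_coords(const int *ids, size_t n, double sign, double *coords)
{
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < 3; k++)
            coords[3 * i + k] = sign * (double)ids[i];
}

/* Hypothetical helper: displacement (in doubles) of each vertex in the
   global array, assuming 1-based global ids and 3 doubles per vertex. */
static void make_displs(const int *ids, size_t n, int *displs)
{
    for (size_t i = 0; i < n; i++) {
        assert(ids[i] >= 1);
        displs[i] = 3 * (ids[i] - 1);
    }
}

/* Emulates what a receive with an indexed datatype should do: block i of
   the packed message (3 doubles) is stored at offset displs[i] of dest. */
static void unpack_indexed(const double *packed, const int *displs,
                           size_t n_blocks, double *dest)
{
    for (size_t i = 0; i < n_blocks; i++)
        for (size_t k = 0; k < 3; k++)
            dest[(size_t)displs[i] + k] = packed[3 * i + k];
}

/* Counts vertices whose coordinates are not all equal to +id or all
   equal to -id, which is the symptom described in the report. */
static size_t count_mismatches(const double *coords, size_t n_global)
{
    size_t bad = 0;
    for (size_t g = 1; g <= n_global; g++) {
        const double *c = coords + 3 * (g - 1);
        double id = (double)g;
        int uniform = (c[0] == c[1] && c[1] == c[2]);
        int matches_id = (c[0] == id || c[0] == -id);
        if (!uniform || !matches_id)
            bad++;
    }
    return bad;
}
```

In a two-rank setup like the one described, rank 0 would first store its own (negative) values, then apply the values received from rank 1 on top. A vertex known to both ranks therefore ends up positive, and count_mismatches should return 0 when the datatype handling is correct.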