Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Allreduce on local machine
From: Gus Correa (gus_at_[hidden])
Date: 2010-07-28 17:07:29


Hi All

Martin Siegert wrote:
> On Wed, Jul 28, 2010 at 01:05:52PM -0700, Martin Siegert wrote:
>> On Wed, Jul 28, 2010 at 11:19:43AM -0400, Gus Correa wrote:
>>> Hugo Gagnon wrote:
>>>> Hi Gus,
>>>> Ompi_info --all lists its info regarding fortran right after C. In my
>>>> case:
>>>> Fort real size: 4
>>>> Fort real4 size: 4
>>>> Fort real8 size: 8
>>>> Fort real16 size: 16
>>>> Fort dbl prec size: 4
>>>> Does it make any sense to you?
>>> Hi Hugo
>>>
>>> No, dbl prec size 4 sounds weird, should be 8, I suppose,
>>> same as real8, right?
>>>
>>> It doesn't make sense, but that's what I have (now that you told me
>>> that "dbl" , not "double", is the string to search for):
>>>
>>> $ Fort dbl prec size: 4
>>> Fort dbl cplx size: 4
>>> Fort dbl prec align: 4
>>> Fort dbl cplx align: 4
>>>
>>> Is this a bug in OpenMPI perhaps?
>>>
>>> I didn't come across this problem, most likely because
>>> the codes here don't use "double precision" but real*8 or similar.
>>>
>>> Also make sure you are picking the right ompi_info, mpif90/f77, mpiexec.
>>> Often times old versions and tangled PATH make things very confusing.
>> This is indeed worrisome as I confirm the findings on our clusters both
>> with ompi 1.3.3 and 1.4.1:
>>
>> ompi_info --all | grep -i fort
>> ...
>> Fort real size: 4
>> Fort real4 size: 4
>> Fort real8 size: 8
>> Fort real16 size: -1
>> Fort dbl prec size: 4
>> Fort cplx size: 4
>> Fort dbl cplx size: 4
>> Fort cplx8 size: 8
>> Fort cplx16 size: 16
>> Fort cplx32 size: -1
>> Fort integer align: 4
>> Fort integer1 align: 1
>> Fort integer2 align: 2
>> Fort integer4 align: 4
>> Fort integer8 align: 8
>> Fort integer16 align: -1
>> Fort real align: 4
>> Fort real4 align: 4
>> Fort real8 align: 8
>> Fort real16 align: -1
>> Fort dbl prec align: 4
>> Fort cplx align: 4
>> Fort dbl cplx align: 4
>> Fort cplx8 align: 4
>> Fort cplx16 align: 8
>> ...
>>
>> And this is the configure output:
>> checking if Fortran 77 compiler supports REAL*8... yes
>> checking size of Fortran 77 REAL*8... 8
>> checking for C type corresponding to REAL*8... double
>> checking alignment of Fortran REAL*8... 1
>> ...
>> checking if Fortran 77 compiler supports DOUBLE PRECISION... yes
>> checking size of Fortran 77 DOUBLE PRECISION... 8
>> checking for C type corresponding to DOUBLE PRECISION... double
>> checking alignment of Fortran DOUBLE PRECISION... 1
>>
>> But the following code actually appears to give the correct results:
>>
>> program types
>> use mpi
>> implicit none
>> integer :: mpierr, size
>>
>> call MPI_Init(mpierr)
>> call MPI_Type_size(MPI_DOUBLE_PRECISION, size, mpierr)
>> print*, 'double precision size: ', size
>> call MPI_Finalize(mpierr)
>> end
>>
>> mpif90 -g types.f90
>> mpiexec -n 1 ./a.out
>> double precision size: 8
>>
>> Thus is this a bug in ompi_info only?
>
> answering my own question:
> This does not look right:
>
> ompi/tools/ompi_info/param.cc:
>
> out("Fort dbl prec size",
> "compiler:fortran:sizeof:double_precision",
> OMPI_SIZEOF_FORTRAN_REAL);
>
> that should be OMPI_SIZEOF_FORTRAN_DOUBLE_PRECISION.
>
> - Martin

Hopefully Martin has got it and the issue is restricted to ompi_info.
Thanks, Martin, for writing and running the little diagnostic code,
and for checking the ompi_info guts!

Still, the alignment under Intel may or may not be right.
And this may or may not explain the errors that Hugo has got.

FYI, the ompi_info from my OpenMPI 1.3.2 and 1.2.8
report exactly the same as OpenMPI 1.4.2, namely
Fort dbl prec size: 4 and
Fort dbl prec align: 4,
except that *if the Intel Fortran compiler (ifort) was used*
I get 1 byte alignment:
Fort dbl prec align: 1

So, this issue has been around for a while,
and involves both the size and the alignment (in Intel)
of double precision.
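
One way to tell an ompi_info reporting problem from a real library
problem is to ask the MPI library itself. Here is a minimal sketch
along the lines of Martin's test, extended to compare three types.
The program name and variable names are my own, and it assumes the
"use mpi" module and an MPI_REAL8 supported by your compiler.
Note that it checks sizes only; I don't know of an MPI call that
reports the alignment.

```fortran
program fort_sizes
  use mpi
  implicit none
  integer :: ierr, sz_real, sz_real8, sz_dbl

  call MPI_Init(ierr)

  ! Ask the library (not ompi_info) for the size of each type, in bytes
  call MPI_Type_size(MPI_REAL, sz_real, ierr)
  call MPI_Type_size(MPI_REAL8, sz_real8, ierr)
  call MPI_Type_size(MPI_DOUBLE_PRECISION, sz_dbl, ierr)

  print *, 'MPI_REAL size:             ', sz_real
  print *, 'MPI_REAL8 size:            ', sz_real8
  print *, 'MPI_DOUBLE_PRECISION size: ', sz_dbl

  ! If these two differ, the library itself disagrees about double precision
  if (sz_dbl /= sz_real8) then
     print *, 'WARNING: MPI_DOUBLE_PRECISION and MPI_REAL8 sizes differ'
  end if

  call MPI_Finalize(ierr)
end program fort_sizes
```

Build and run it the same way as Martin's test (mpif90, then
mpiexec -n 1 ./a.out). If the sizes agree here while ompi_info
still prints 4, that points at ompi_info rather than the library.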

We have a number of pieces of code here where grep shows
MPI_DOUBLE_PRECISION.
Not sure how much of it has actually been active, as there are always
lots of cpp directives to select active code.

In particular I found this interesting snippet:

     if (MPI_DOUBLE_PRECISION==20 .and. MPI_REAL8==18) then
        ! LAM MPI defined MPI_REAL8 differently from MPI_DOUBLE_PRECISION
        ! and LAM MPI's allreduce does not accept MPI_REAL8
        MPIreal_t = MPI_DOUBLE_PRECISION
     else
        MPIreal_t = MPI_REAL8
     endif

where eventually MPIreal_t is what is used as
the MPI type in some MPI calls, particularly in MPI_Allreduce,
which is the one that triggered all this discussion
(see this thread's Subject line) when Hugo first
asked his original question.

Hopefully the if branch in the code snippet above worked all right,
because here, in our OpenMPI 1.4.2, 1.3.2, and 1.2.8 installations,
the value of MPI_DOUBLE_PRECISION is 17,
which should have safely taken the else branch and produced
MPIreal_t = MPI_REAL8
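
To see which branch a given installation takes, and whether
MPI_Allreduce accepts the selected type, something like the sketch
below could be used. It repeats the selection logic from the snippet
and then sums 1.0 across all ranks. The program name and the variables
x, total and tsize are made up for illustration, and real(kind=8)
assumes a compiler (such as gfortran or ifort) where kind 8 means an
8-byte real, which the Fortran standard does not guarantee.

```fortran
program check_real_type
  use mpi
  implicit none
  integer :: ierr, rank, nprocs, tsize, MPIreal_t
  real(kind=8) :: x, total   ! assumption: kind=8 is an 8-byte real

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! Same selection logic as the snippet quoted above
  if (MPI_DOUBLE_PRECISION == 20 .and. MPI_REAL8 == 18) then
     MPIreal_t = MPI_DOUBLE_PRECISION
  else
     MPIreal_t = MPI_REAL8
  end if

  ! Size, in bytes, of whichever type was selected
  call MPI_Type_size(MPIreal_t, tsize, ierr)

  ! Each rank contributes 1.0, so the sum should equal the number of ranks
  x = 1.0d0
  call MPI_Allreduce(x, total, 1, MPIreal_t, MPI_SUM, MPI_COMM_WORLD, ierr)

  if (rank == 0) then
     print *, 'MPI_DOUBLE_PRECISION handle: ', MPI_DOUBLE_PRECISION
     print *, 'MPI_REAL8 handle:            ', MPI_REAL8
     print *, 'selected type size (bytes):  ', tsize
     print *, 'allreduce sum, expected:     ', total, real(nprocs, kind=8)
  end if

  call MPI_Finalize(ierr)
end program check_real_type
```

Running it with, say, mpiexec -n 4 should print a sum equal to the
number of processes if the selected type really matches an 8-byte real.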

I have a lot more code to check, though maybe that won't be necessary.
If the issue is really restricted to ompi_info that would be a
big relief.

Many thanks,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------
