Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_IN_PLACE in Fortran with MPI_REDUCE / MPI_ALLREDUCE
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-08-04 10:19:05


Hmm. I can now replicate this on OSX as well, but I'm not sure I
agree with all of your analysis. Here's what I get from an OMPI SVN
trunk build:

[9:34] rtp-jsquyres-8718:~/bogus/lib % foreach file (`ls *.0.dylib`)
foreach? echo ================= $file
foreach? nm $file | grep in_place
foreach? end
================= libmca_common_sm.0.dylib
================= libmpi.0.dylib
0011a638 D _MPI_FORTRAN_IN_PLACE
0011a634 D _mpi_fortran_in_place
0011a63c D _mpi_fortran_in_place_
0011a640 D _mpi_fortran_in_place__
================= libmpi_cxx.0.dylib
00008144 S __ZN3MPI8IN_PLACEE
================= libmpi_f77.0.dylib
          U _MPI_FORTRAN_IN_PLACE
          U _mpi_fortran_in_place
          U _mpi_fortran_in_place_
          U _mpi_fortran_in_place__
================= libopen-pal.0.dylib
================= libopen-rte.0.dylib
0007f2b4 D _orte_snapc_base_store_in_place

The __Z symbol is in libmpi_cxx, so I don't think it's relevant here
(that's the part that I disagree about). But notice that my
*fortran_in_place*/i symbols are "D" in libmpi (where they are
defined) and U in libmpi_f77. This is different than your output.
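
As a rough illustration of the check described above (symbols defined in libmpi, undefined references in libmpi_f77), here is a minimal Python sketch that parses `nm`-style lines and confirms that every undefined reference has a definition. The helper name `parse_nm_line` and the sample lists are hypothetical; the sample lines are copied from the trunk listing.

```python
def parse_nm_line(line):
    """Split one line of nm output into (address, type, name).

    Undefined symbols have no address column, so address is None for them.
    """
    fields = line.split()
    if len(fields) == 3:
        address, sym_type, name = fields
        return int(address, 16), sym_type, name
    if len(fields) == 2:
        sym_type, name = fields
        return None, sym_type, name
    raise ValueError(f"unrecognized nm line: {line!r}")


# Sample lines copied from the trunk listing (hypothetical variable names).
libmpi_lines = [
    "0011a638 D _MPI_FORTRAN_IN_PLACE",
    "0011a634 D _mpi_fortran_in_place",
    "0011a63c D _mpi_fortran_in_place_",
    "0011a640 D _mpi_fortran_in_place__",
]
libmpi_f77_lines = [
    "          U _MPI_FORTRAN_IN_PLACE",
    "          U _mpi_fortran_in_place",
    "          U _mpi_fortran_in_place_",
    "          U _mpi_fortran_in_place__",
]

# Names that libmpi defines (any type other than U).
defined = {name for _, t, name in map(parse_nm_line, libmpi_lines) if t != "U"}
# Names that libmpi_f77 expects some other library to provide.
undefined = {name for _, t, name in map(parse_nm_line, libmpi_f77_lines) if t == "U"}

# References in libmpi_f77 that libmpi does not define.
unresolved = undefined - defined
print(sorted(unresolved))
```

An empty result only says the names line up; it says nothing about whether the addresses seen at run time match.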

Here's the output from a 1.3.3 build:

[9:55] rtp-jsquyres-8718:~/bogus/1.3/lib % !for
foreach file ( `ls *.0.dylib` )
foreach? echo =========== $file
foreach? nm $file | grep in_place
foreach? end
=========== libmca_common_sm.0.dylib
=========== libmpi.0.dylib
000a4d30 S _MPI_FORTRAN_IN_PLACE
000a4d34 S _mpi_fortran_in_place
000a4d38 S _mpi_fortran_in_place_
000a4d3c S _mpi_fortran_in_place__
=========== libmpi_cxx.0.dylib
00007328 S __ZN3MPI8IN_PLACEE
=========== libmpi_f77.0.dylib
          U _mpi_fortran_in_place_
=========== libopen-pal.0.dylib
=========== libopen-rte.0.dylib
00036eea D _orte_snapc_base_store_in_place

Here's the output from a 1.2.9 build:

[9:35] rtp-jsquyres-8718:~/bogus/1.2/lib % foreach file ( `ls *.0.dylib` )
foreach? echo ============= $file
foreach? nm $file | grep in_place
foreach? end
============= libmca_common_sm.0.dylib
============= libmpi.0.dylib
00093950 S _MPI_FORTRAN_IN_PLACE
00093954 S _mpi_fortran_in_place
00093958 S _mpi_fortran_in_place_
0009395c S _mpi_fortran_in_place__
============= libmpi_cxx.0.dylib
0000e00c D __ZN3MPI8IN_PLACEE
============= libmpi_f77.0.dylib
          U _mpi_fortran_in_place_
============= libopen-pal.0.dylib
============= libopen-rte.0.dylib

Notes:

1. I can't see how libmpi_cxx affects anything in the f90 app.
2. The trunk builds have the symbols as D's, but the 1.2 and 1.3
builds have them as S's.
3. Build and run with 1.2 works, build and run with 1.3 fails.
Inserting output statements in the runs, I can see that 1.2 correctly
detects MPI_IN_PLACE but 1.3 and trunk do not.
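
The reasoning in notes 2 and 3 can be sketched in a few lines of Python. The table below is transcribed from the listings and notes in this message; the function name `type_explains_outcome` is hypothetical. It simply checks whether the nm symbol type alone could predict whether MPI_IN_PLACE is detected.

```python
# (build, nm type of the *_fortran_in_place symbols in libmpi,
#  whether MPI_IN_PLACE was correctly detected at run time)
observations = [
    ("1.2.9", "S", True),
    ("1.3.3", "S", False),
    ("trunk", "D", False),
]


def type_explains_outcome(rows):
    """Return True only if each symbol type always maps to the same outcome."""
    outcome_by_type = {}
    for _build, sym_type, works in rows:
        # Remember the first outcome seen for this type; any later
        # disagreement means the type cannot be the sole explanation.
        if outcome_by_type.setdefault(sym_type, works) != works:
            return False
    return True


print(type_explains_outcome(observations))
```

Both 1.2.9 and 1.3.3 show type S yet behave differently, so the check reports that the symbol type by itself does not account for the failure.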

So it's something more than S vs. D, and I don't believe that the
libmpi_cxx symbol is involved. This is definitely a bug. Doh! With
a *brief* code examination, I don't see any substantive code changes
between 1.2.x and the SVN trunk/v1.3, but we definitely did change
versions of Libtool. I wonder if this is involved somehow.

I don't have the cycles at the moment to investigate, but I've filed a
blocker ticket against OMPI 1.3.4:

     https://svn.open-mpi.org/trac/ompi/ticket/1982

I made it a blocker because I assume this also affects all the other
"special" constants in Fortran, like MPI_BOTTOM.

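For readers unfamiliar with why a "special" constant can break this way: such constants are recognized by their address, not their value. The following Python sketch uses `ctypes` to mimic that idea. All names here (`sentinels`, `is_in_place`) are hypothetical stand-ins, not Open MPI code, and whether mismatched storage is what actually happens in this bug is not established in this thread.

```python
import ctypes

# Hypothetical stand-ins for the four name-mangled sentinel variables
# (e.g. mpi_fortran_in_place, mpi_fortran_in_place_, ...).
sentinels = [ctypes.c_int(0) for _ in range(4)]
sentinel_addrs = {ctypes.addressof(s) for s in sentinels}


def is_in_place(buf):
    """Recognize the sentinel purely by the address of the buffer passed in."""
    return ctypes.addressof(buf) in sentinel_addrs


# The caller passes the library's own storage: recognized.
print(is_in_place(sentinels[2]))

# The caller passes a separate object holding the same value: not
# recognized, because only the address is compared.
separate_copy = ctypes.c_int(0)
print(is_in_place(separate_copy))
```

If the application and the library end up referring to different storage for the same named constant, an address comparison of this kind would fail even though the names match.
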
On Aug 4, 2009, at 5:38 AM, Ricardo Fonseca wrote:

> Hi Jeff
>
> This is a Mac OS X (10.5.7) specific issue, that occurs for all
> versions > 1.2.9 that I've tested (1.3.0 through the 1.4 nightly),
> regardless of what fortran compiler you use (ifort / g95 /
> gfortran). I've been able to replicate this issue on other OS X
> machines, and I am sure that I am using the correct headers /
> libraries. Version 1.2.9 is working correctly. Here are some system
> details:
>
> $ uname -a
> Darwin zamblap.epp.ist.utl.pt 9.7.0 Darwin Kernel Version 9.7.0: Tue
> Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386 i386
>
> $ gcc --version
> i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5493)
>
> $ ld -v
> @(#)PROGRAM:ld PROJECT:ld64-85.2.1
>
> This might be a (again, Mac OS X specific) libtool issue. If you
> look at the name list of the generated .dylib libraries for 1.3.3
> you get:
>
> $ nm /opt/openmpi/1.3.3-g95-32/lib/*.dylib | grep -i in_place
> 000a4d30 S _MPI_FORTRAN_IN_PLACE
> 000a4d34 S _mpi_fortran_in_place
> 000a4d38 S _mpi_fortran_in_place_
> 000a4d3c S _mpi_fortran_in_place__
> 000a4d30 S _MPI_FORTRAN_IN_PLACE
> 000a4d34 S _mpi_fortran_in_place
> 000a4d38 S _mpi_fortran_in_place_
> 000a4d3c S _mpi_fortran_in_place__
> 00007328 S __ZN3MPI8IN_PLACEE
> 00007328 S __ZN3MPI8IN_PLACEE
> U _mpi_fortran_in_place__
> U _mpi_fortran_in_place__
> 00036eea D _orte_snapc_base_store_in_place
> 00036eea D _orte_snapc_base_store_in_place
>
> But for 1.2.9 you get:
>
> $ nm /opt/openmpi/1.2.9-g95-32/lib/*.dylib | grep -i in_place
> 00093950 S _MPI_FORTRAN_IN_PLACE
> 00093954 S _mpi_fortran_in_place
> 00093958 S _mpi_fortran_in_place_
> 0009395c S _mpi_fortran_in_place__
> 00093950 S _MPI_FORTRAN_IN_PLACE
> 00093954 S _mpi_fortran_in_place
> 00093958 S _mpi_fortran_in_place_
> 0009395c S _mpi_fortran_in_place__
> 0000e00c D __ZN3MPI8IN_PLACEE
> 0000e00c D __ZN3MPI8IN_PLACEE
> U _mpi_fortran_in_place__
> U _mpi_fortran_in_place__
>
> So the __ZN3MPI8IN_PLACEE symbol, that I guess refers to the Fortran
> MPI_IN_PLACE constant is being defined incorrectly in the 1.3.3
> version as a S (symbol in a section other than those above), while
> it should be defined as a D (data section symbol) as part of an
> "external" common block, as it happens in 1.2.9. So when linking the
> 1.3.3 version the MPI_IN_PLACE constant will never have the same
> address as any of the mpi_fortran_in_place variables, but rather its
> own address.
>
> Thanks again for your help,
> Ricardo
>
> ---
> Prof. Ricardo Fonseca
>
> GoLP - Grupo de Lasers e Plasmas
> Instituto de Plasmas e Fusão Nuclear
> Instituto Superior Técnico
> Av. Rovisco Pais
> 1049-001 Lisboa
> Portugal
>
> tel: +351 21 8419202
> fax: +351 21 8464455
> web: http://cfp.ist.utl.pt/golp/
>
> On Aug 1, 2009, at 17:00 , users-request_at_[hidden] wrote:
>
>> Message: 2
>> Date: Sat, 1 Aug 2009 07:44:47 -0400
>> From: Jeff Squyres <jsquyres_at_[hidden]>
>> Subject: Re: [OMPI users] MPI_IN_PLACE in Fortran with MPI_REDUCE / MPI_ALLREDUCE
>> To: Open MPI Users <users_at_[hidden]>
>> Message-ID: <CA25CCF4-C5E7-47C0-A24E-8B05B59A6474_at_[hidden]>
>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>
>> Hmm. FWIW, I'm unable to replicate your error. I tried with the
>> OMPI SVN trunk and a build of the OMPI 1.3.3 tarball using the GNU
>> compiler suite on RHEL4U5.
>>
>> I've even compiled your sample code with "mpif90" using the "use mpi"
>> statement -- I did not get an unclassifiable statement. What version
>> of Open MPI are you using? Please send the info listed here:
>>
>> http://www.open-mpi.org/community/help/
>>
>> Can you confirm that you're not accidentally mixing and matching
>> multiple versions of Open MPI?
>

-- 
Jeff Squyres
jsquyres_at_[hidden]