Open MPI User's Mailing List Archives

From: Edgar Gabriel (gabriel_at_[hidden])
Date: 2006-05-04 10:31:08


Michael, I think you are right: the logic is exactly reversed in our
code. I'll commit a fix on the trunk, and I assume that this fix will
also be backported to v1.1 and maybe v1.0.

Thanks
Edgar

Michael Kluskens wrote:
> MPI_Intercomm_merge is broken in OpenMPI 1.1a4r9788 (and likely all
> versions)
>
> Details: the second argument, high, of MPI_Intercomm_merge is a
> logical in Fortran (pg 216 of Using MPI) and an int in C. This is now
> correct with regard to the f90 interfaces in OpenMPI 1.1. The
> meaning of "high" is as follows (from pg 313 MPI-The Complete
> Reference):
>
> If processes in one group provided the value high = false and
> processes in the other group provided the value high = true then the
> union orders the "low" group before the "high" group.
>
> In other words if I have the following:
> MPI process "parent" calls MPI_Intercomm_merge with high = .false.
> ( high = 0 in C)
> MPI process "child" calls MPI_Intercomm_merge with high = .true.
> (high = 1 in C)
> then in the merged communicator - parent has rank 0 and child has
> rank 1. This is not happening in my tests on OS X 10.4.6 with g95;
> however, my two alternative test systems handle this case as I expect
> -- Debian Linux with MPICH2 1.0.3 (g95) and SGI MPI Library
> (sgi-mpt-1.10.1-sgi301r1) (Intel Fortran 9.x).
>
> The following test code is written to use the Fortran 90 interfaces
> but it can be switched to the include file and fixed format source
> code (.f) and should compile with both f90 and f77 compilers. I have
> not written a C test code.
>
> Michael
>
> mpif90 parent4.f90 -o parent4
> mpif90 child4.f90 -o child4
>
> parent startup: 0 of 1
> a child starting
> parent spawned child process
> child 0 of 1
> parent merge comm: 1 of 2
> ERROR: parent rank incorrect after merge
> ERROR: child rank incorrect after merge
>
> -- parent4.f90 --
> program parent4
> USE MPI
> implicit none
> integer ierr,size,rank,child,allmpi
> integer k, subprocesses
>
> call MPI_INIT(ierr)
> call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)
> call MPI_COMM_SIZE(MPI_COMM_WORLD,size,ierr)
>
> write(6,*) 'parent startup: ', rank, ' of ', size
> subprocesses = 1
>
> call MPI_Comm_spawn('child4', MPI_ARGV_NULL, subprocesses, &
> & MPI_INFO_NULL, 0, MPI_COMM_WORLD, child, MPI_ERRCODES_IGNORE, &
> & ierr )
> write(6,*) 'parent spawned child process'
>
> call MPI_Intercomm_merge( child, .false., allmpi, ierr )
> call MPI_COMM_RANK(allmpi,rank,ierr)
> call MPI_COMM_SIZE(allmpi,size,ierr)
> write(*,'(2(A,I3))') 'parent merge comm:',rank, ' of', size
>
> if ( rank .ne. 0 ) then
> write(6,*) 'ERROR: parent rank incorrect after merge'
> end if
> call MPI_COMM_FREE(allmpi,ierr)
> call MPI_COMM_FREE(child,ierr)
>
> call MPI_FINALIZE(ierr)
> end
> -- child4.f90 --
> program child4
> USE MPI
> implicit none
> integer :: ierr,size,rank,parent,rsize,allmpi
>
> write(*,*) 'a child starting'
> call MPI_INIT(ierr)
> call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)
> call MPI_COMM_SIZE(MPI_COMM_WORLD,size,ierr)
> write(*,'(2(A,I3))') 'child',rank,' of', size
> call MPI_Comm_get_parent(parent,ierr)
>
> call MPI_Intercomm_merge( parent, .true., allmpi, ierr )
> call MPI_COMM_RANK(allmpi,rank,ierr)
> call MPI_COMM_SIZE(allmpi,size,ierr)
> if ( rank .eq. 0 ) then
> write(6,*) 'ERROR: child rank incorrect after merge'
> end if
>
> call MPI_COMM_FREE(allmpi,ierr)
> call MPI_COMM_FREE(parent,ierr)
> call MPI_FINALIZE(ierr)
>
> write(*,'(2(A,I3),A)') 'child',rank,' of',size,' exiting'
> end
> ------------------------------------
>
> On May 2, 2006, at 11:54 PM, Jeff Squyres (jsquyres) wrote:
>
>
>>Ok -- let me know what you find. I just checked and the code *looks*
>>right to me, but that doesn't mean that there isn't some deeper
>>implication that I'm missing.
>>
>>
>>>-----Original Message-----
>>>From: users-bounces_at_[hidden]
>>>[mailto:users-bounces_at_[hidden]] On Behalf Of Michael Kluskens
>>>Sent: Tuesday, May 02, 2006 6:05 PM
>>>To: Open MPI Users
>>>Subject: Re: [OMPI users] openmpi-1.0.2 configure problem
>>>
>>>My test codes compile fine but I'm fairly certain the logical is
>>>being handled incorrectly. When I merge two comms with one having
>>>high=.false. and the other high=.true., the latter should go into the
>>>higher ranks and the former should contain rank 0.
>>>
>>>I'll work it over again tomorrow and see if I can create an f77
>>>version or use the mpi.h file and see if I can get a clear difference
>>>and I'll compare against MPICH2 but someone else should look into
>>>this issue.
>>>
>>>Michael
>>>
>>>On May 1, 2006, at 11:57 PM, Jeff Squyres (jsquyres) wrote:
>>>
>>>
>>>>I just fixed the INTERCOMM_MERGE/logical issue on the trunk and the
>>>>v1.1 branch -- can you give it a whirl there?
>>>>
>>>>I ask because this issue is a bug that we fixed on the trunk (and
>>>>therefore v1.1) and didn't back-port it to v1.0. There's actually
>>>>quite a few of these F90 fixes on the trunk/v1.1 branch that we did
>>>>not back-port to v1.0 (e.g., most of the other logical fixes) mainly
>>>>because we thought you were the main consumer of the F90 MPI API (and
>>>>therefore it wasn't worth it to back port :-) ). If you need all
>>>>these fixes in v1.0, we can spend the time to do the back-port, but
>>>>would prefer not to if possible.
>>>>
>>>>
>>>>
>>>>>-----Original Message-----
>>>>>From: users-bounces_at_[hidden]
>>>>>[mailto:users-bounces_at_[hidden]] On Behalf Of Michael Kluskens
>>>>>Sent: Monday, May 01, 2006 6:20 PM
>>>>>To: Open MPI Users
>>>>>Subject: [OMPI users] openmpi-1.0.2 configure problem
>>>>>
>>>>>checking if FORTRAN compiler supports integer(selected_int_kind(2))... yes
>>>>>checking size of FORTRAN integer(selected_int_kind(2))... unknown
>>>>>configure: WARNING: *** Problem running configure test!
>>>>>configure: WARNING: *** See config.log for details.
>>>>>configure: error: *** Cannot continue.
>>>>>
>>>>>Source code: openmpi-1.0.2 stable
>>>>>OS X 10.4.5 with g95 (Apr 27 2006)
>>>>>./configure F77=g95 FC=g95 LDFLAGS=-lSystemStubs
>>>>>
>>>>>I find this rather surprising given that I have been regularly
>>>>>building nightly snapshots of Open MPI 1.1 and 1.2 (the other bug is
>>>>>preventing me from using them at the moment till either I change my
>>>>>code or the bug gets fixed).
>>>>>
>>>>>
>>>>
>>>>_______________________________________________
>>>>users mailing list
>>>>users_at_[hidden]
>>>>http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>