Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI devel] Inherent limit on #communicators?
From: Brian W. Barrett (brbarret_at_[hidden])
Date: 2009-05-01 11:44:26


Ugh - I'll fix today.

Brian

On Fri, 1 May 2009, Ralph Castain wrote:

> BTW: when compiling Brian's change, I got a warning about comparing
> signed and unsigned. Sure enough, I found that the communicator id is
> defined as an unsigned int, while the PML is treating it as a *signed*
> int.
>
> We need to get this corrected - which way do you want it to be?
>
> I will add this requirement to the ticket...
>
> Thanks
> Ralph
>
>
> On Fri, May 1, 2009 at 6:38 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> I'm not entirely sure if David is going to be in today, so I
> will answer for him (and let him correct me later!).
>
> This code is indeed representative of what the app is doing.
> Basically, the user repeatedly splits the communicator so he
> can run mini test cases before going on to the larger
> computation. So it is always the base communicator being
> repeatedly split and freed.
>
> I would suspect, therefore, that the quick fix would serve us
> just fine while the worst case is later resolved.
>
> Thanks
> Ralph
>
>
> On Fri, May 1, 2009 at 6:08 AM, Edgar Gabriel <gabriel_at_[hidden]>
> wrote:
> David,
>
> Is this code representative of what your app is doing?
> E.g., you have a base communicator (e.g. MPI_COMM_WORLD)
> which is being 'split', freed again, split, freed again,
> etc.? I.e., the important aspect is that the same
> 'base' communicator is being used for deriving new
> communicators again and again?
>
> The reason I ask is two-fold: one, you would in that
> case be one of the ideal beneficiaries of the block cid
> algorithm :-) (even if it fails you right now); two, a
> fix for this scenario, which basically tries to reuse
> the last block used (and which would fix your case if
> the condition is true), is roughly five lines of code.
> This would give us the possibility to have a fix
> quickly in the trunk and v1.3 (keep in mind that the
> block-cid code has been in the trunk for two years and
> this is the first problem that we have had) and give us
> more time to develop a thorough solution for the worst
> case: a chain of communicators being created, e.g.
> communicator 1 is the basis to derive a new comm 2, comm 2
> is being used to derive comm 3, etc.
>
> Thanks
> Edgar
>
> David Gunter wrote:
> Here is the test code reproducer:
>
>      program test2
>      implicit none
>      include 'mpif.h'
>      integer ierr, myid, numprocs,i1,i2,n,local_comm,
>     $     icolor,ikey,rank,root
>
> c
> c...  MPI set-up
>      ierr = 0
>      call MPI_INIT(IERR)
>      ierr = 1
>      CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
>      print *, ierr
>
>      ierr = -1
>
>      CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
>
>      ierr = -5
>      i1 = ierr
>      if (myid.eq.0) i1 = 1
>      call mpi_allreduce(i1, i2, 1,MPI_integer,MPI_MIN,
>     $     MPI_COMM_WORLD,ierr)
>
>      ikey = myid
>      if (mod(myid,2).eq.0) then
>         icolor = 0
>      else
>         icolor = MPI_UNDEFINED
>      end if
>
>      root = 0
>      do n = 1, 100000
>
>         call MPI_COMM_SPLIT(MPI_COMM_WORLD, icolor,
>     $        ikey, local_comm, ierr)
>
>         if (mod(myid,2).eq.0) then
>            CALL MPI_COMM_RANK(local_comm, rank, ierr)
>            i2 = i1
>            call mpi_reduce(i1, i2, 1,MPI_integer,MPI_MIN,
>     $           root, local_comm,ierr)
>
>            if (myid.eq.0.and.mod(n,10).eq.0)
>     $           print *, n, i1, i2,icolor,ikey
>
>            call mpi_comm_free(local_comm, ierr)
>         end if
>
>      end do
> c      if (icolor.eq.0) call mpi_comm_free(local_comm, ierr)
>
>      call MPI_barrier(MPi_COMM_WORLD,ierr)
>
>      call MPI_FINALIZE(IERR)
>
>      print *, myid, ierr
>
>      end
>
>
>
> -david
> --
> David Gunter
> HPC-3: Parallel Tools Team
> Los Alamos National Laboratory
>
>
>
> On Apr 30, 2009, at 12:43 PM, David Gunter wrote:
>
> Just to throw out more info on this, the test code runs fine on
> previous versions of OMPI. It only hangs on the 1.3 line when the
> cid reaches 65536.
>
> -david
> --
> David Gunter
> HPC-3: Parallel Tools Team
> Los Alamos National Laboratory
>
>
>
> On Apr 30, 2009, at 12:28 PM, Edgar Gabriel wrote:
>
> cid's are in fact not recycled in the block algorithm. The problem
> is that comm_free is not collective, so you cannot make any
> assumptions about whether other procs have also released that
> communicator.
>
> But nevertheless, a cid in the communicator structure is a
> uint32_t, so it should not hit the 16k limit there yet. This is
> not new, so if there is a discrepancy between what the comm
> structure assumes that a cid is and what the pml assumes, then
> this was in the code since the very first days of Open MPI...
>
> Thanks
> Edgar
>
> Brian W. Barrett wrote:
> On Thu, 30 Apr 2009, Ralph Castain wrote:
> We seem to have hit a problem here - it looks like we are seeing a
> built-in limit on the number of communicators one can create in a
> program. The program basically does a loop, calling MPI_Comm_split
> each time through the loop to create a sub-communicator, does a
> reduce operation on the members of the sub-communicator, and then
> calls MPI_Comm_free to release it (this is a minimized reproducer
> for the real code). After 64k times through the loop, the program
> fails.
>
> This looks remarkably like a 16-bit index that hits a max value and
> then blocks.
>
> I have looked at the communicator code, but I don't immediately see
> such a field. Is anyone aware of some other place where we would
> have a limit that would cause this problem?
>
> There's a maximum of 32768 communicator ids when using OB1 (each PML
> can set the max contextid, although the communicator code is the
> part that actually assigns a cid). Assuming that comm_free is
> actually properly called, there should be plenty of cids available
> for that pattern. However, I'm not sure I understand the block
> algorithm someone added to cid allocation - I'd have to guess that
> there's something funny with that routine and cids aren't being
> recycled properly.
>
> Brian
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Edgar Gabriel
> Assistant Professor
> Parallel Software Technologies Lab      http://pstl.cs.uh.edu
> Department of Computer Science          University of Houston
> Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
> Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335
>
> --
> Edgar Gabriel
> Assistant Professor
> Parallel Software Technologies Lab      http://pstl.cs.uh.edu
> Department of Computer Science          University of Houston
> Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
> Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335