Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Comm_spawn limits
From: Andreas Schäfer (gentryx_at_[hidden])
Date: 2008-10-27 17:52:29


I don't know any implementation details, but is widening the 16-bit
counter to a 32-bit counter really so much harder than this fancy
(overengineered? ;-) ) table construction? The way I see it, this
table might become a real mess if multiple MPI_Comm_spawn calls are
issued simultaneously in different communicators. (Would that be
legal MPI?)
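
For comparison, here is a rough sketch of the "this ID is in use" table idea quoted below. It is only an illustration, not how ORTE does or would do it: the class name `LaunchIdAllocator`, its methods, and the default limit of 1024 are all made up for this example.

```python
class LaunchIdAllocator:
    """Hypothetical allocator for the lower launch-counter bits of a jobid."""

    def __init__(self, bits=16, max_concurrent=1024):
        self.size = 1 << bits                       # number of distinct IDs
        self.max_concurrent = min(max_concurrent, self.size)
        self.in_use = set()                         # the "in use" table
        self.next_id = 0                            # running counter

    def allocate(self):
        # Refuse to launch once N spawned jobs are alive at the same time.
        if len(self.in_use) >= self.max_concurrent:
            raise RuntimeError("too many concurrent spawned jobs")
        # Skip values still owned by a running job. A free value must exist
        # because fewer than `size` IDs are in use at this point.
        while self.next_id in self.in_use:
            self.next_id = (self.next_id + 1) % self.size
        launch_id = self.next_id
        self.in_use.add(launch_id)
        self.next_id = (launch_id + 1) % self.size  # wraps around at `size`
        return launch_id

    def release(self, launch_id):
        # Called when a spawned job has completed and its ID may be reused.
        self.in_use.discard(launch_id)
```

The sketch is single-threaded on purpose. With several spawns in flight at once, `allocate` and `release` would need a lock (or a single owner, such as mpirun), which is exactly where I'd expect the mess to start.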

Anyway, just my $0.01 (we don't get so many dollars for our euros
anymore...)
-Andreas

On 17:02 Mon 27 Oct , Jeff Squyres wrote:
> How about a variation on that idea: keep a global bitmap or some other kind
> of "this ID is in use" table. Hence, if the launch counter rolls over, you
> can simply check the table to find a free value. That way, you can be sure
> to never re-use a value that is still being used.
>
> So we'd have 16 bits to express this counter, but we could introduce a
> limit of how many concurrent spawns we support. Hence, the IDs can be
> large, but we only allow having N distinct values at any one time (quite
> similar to PIDs and an OS process table). We can specify the value of N
> via configure, an MCA parameter, ...whatever. If the MPI job tries to have
> more than N concurrent spawned jobs, it's an error. But for a job that
> continuously spawns jobs that each die off in short finite time, it'll be
> no problem. The counter will likely cycle around, but won't run into any
> problems as long as there are <N total spawns still running.
>
> <waving hands a bit> There are probably some off-by-one errors in the
> above paragraph, but you get the idea. :-)
>
>
> On Oct 22, 2008, at 2:59 PM, Ralph Castain wrote:
>
>> I can't swear to this because I haven't fully grokked it yet, but I
>> believe the answer is:
>>
>> 1. if child jobs have completed, it won't hurt. I think the various
>> subsystems clean up their bookkeeping when a job completes, so we
>> could possibly reuse the number. Might be some race conditions we
>> would have to resolve.
>>
>> 2. if child jobs haven't completed (which is the situation this
>> particular user was attempting), then we would have a problem with
>> jobid confusion. Once we get the procs launched, though, I'm not sure
>> how much of a problem there is - would have to investigate. Could
>> cause some bookkeeping problems for job completion.
>>
>> Interesting possibility, though...consider it another option for now.
>>
>>
>>
>> On Oct 22, 2008, at 12:53 PM, George Bosilca wrote:
>>
>> > What happens if the counter rolls over?
>> >
>> > george.
>> >
>> > On Oct 22, 2008, at 2:49 PM, Ralph Castain wrote:
>> >
>> >> There recently was activity on the mailing lists where someone was
>> >> attempting to call comm_spawn 100,000 times. Setting aside the
>> >> threading issues that were the focus of that exchange, the fact is
>> >> that OMPI currently cannot handle that many comm_spawns.
>> >>
>> >> The ORTE jobid is composed of two elements:
>> >>
>> >> 1. the top 16 bits are an "identifier" for that mpirun
>> >>
>> >> 2. the lower 16 bits are a running counter identifying the specific
>> >> job/launch for those procs.
>> >>
>> >> Thus, we are limited to 64k comm_spawns.
>> >>
>> >> Expanding this would require either revamping the entire way we
>> >> handle jobs (e.g., removing the mpirun identifier - major effort),
>> >> or expanding the orte_jobid_t from its current 32 bits to 64 bits.
>> >>
>> >> Is this a problem we want to address?
>> >> Ralph
>> >>
>> >> _______________________________________________
>> >> devel mailing list
>> >> devel_at_[hidden]
>> >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Jeff Squyres
> Cisco Systems
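
As a footnote to the quoted thread, the jobid layout Ralph describes (top 16 bits identify the mpirun, lower 16 bits are the launch counter) can be sketched as below. The helper names `make_jobid` and `split_jobid` are hypothetical and are not ORTE functions. Only the 16/16 split comes from the thread.

```python
FIELD_BITS = 16                      # width of each half, per the thread
FIELD_MASK = (1 << FIELD_BITS) - 1   # 0xFFFF

# Number of distinct launch counter values, hence the 64k limit.
MAX_LAUNCHES = 1 << FIELD_BITS


def make_jobid(mpirun_id, launch_counter):
    """Pack two 16-bit fields into one 32-bit jobid (hypothetical helper)."""
    for value in (mpirun_id, launch_counter):
        if not 0 <= value <= FIELD_MASK:
            raise ValueError("field does not fit in 16 bits")
    return (mpirun_id << FIELD_BITS) | launch_counter


def split_jobid(jobid):
    """Return (mpirun_id, launch_counter) for a packed 32-bit jobid."""
    return jobid >> FIELD_BITS, jobid & FIELD_MASK
```

Seen this way, a 32-bit counter only fits if the mpirun identifier is dropped or the whole jobid grows to 64 bits, which are the two options Ralph lists.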

-- 
============================================
Andreas Schäfer
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
0049/3641-9-46376
PGP/GPG key via keyserver
I'm a bright... http://www.the-brights.net
============================================
(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your 
signature to help him gain world domination!