Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Comm_spawn limits
From: Andreas Schäfer (gentryx_at_[hidden])
Date: 2008-10-27 17:52:29


I don't know any implementation details, but is turning a 16-bit
counter into a 32-bit counter really so much harder than this fancy
(overengineered? ;-) ) table construction? The way I see it, this
table might become a real mess if multiple MPI_Comm_spawn calls are
issued simultaneously in different communicators. (Would that even be
legal MPI?)

Anyway, just my $0.01 (we don't get so many dollars for our euros
anymore...)
-Andreas

On 17:02 Mon 27 Oct, Jeff Squyres wrote:
> How about a variation on that idea: keep a global bitmap or some other kind
> of "this ID is in use" table. Hence, if the launch counter rolls over, you
> can simply check the table to find a free value. That way, you can be sure
> to never re-use a value that is still being used.
>
> So we'd have 16 bits to express this counter, but we could introduce a
> limit of how many concurrent spawns we support. Hence, the IDs can be
> large, but we only allow having N distinct values at any one time (quite
> similar to PIDs and an OS process table). We can specify the value of N
> via configure, an MCA parameter, ...whatever. If the MPI job tries to have
> more than N concurrent spawned jobs, it's an error. But for a job that
> continuously spawns jobs that each die off in short finite time, it'll be
> no problem. The counter will likely cycle around, but won't run into any
> problems as long as there are <N total spawns still running.
>
> <waving hands a bit> There's probably some off-by-one errors in the above
> paragraph, but you get the idea. :-)
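[Editor's sketch of the "ID in use" table Jeff describes. All names here are hypothetical, not actual ORTE code; `MAX_CONCURRENT_JOBS` plays the role of the configurable N:]

```c
/* Sketch of a free-ID table with a rolling 16-bit launch counter.
 * Hypothetical names, not actual ORTE code. N caps how many spawned
 * jobs may be alive at once; the counter may wrap, but any value
 * still marked in use is skipped. */
#include <stdint.h>
#include <stdbool.h>

#define MAX_CONCURRENT_JOBS 1024            /* the "N" from the mail */

static bool id_in_use[UINT16_MAX + 1];      /* one flag per 16-bit ID */
static uint16_t next_id = 0;                /* rolling launch counter */
static int active_jobs = 0;

/* Returns a free 16-bit job ID, or -1 if N jobs are already live. */
int alloc_job_id(void)
{
    if (active_jobs >= MAX_CONCURRENT_JOBS) {
        return -1;                          /* too many concurrent spawns */
    }
    while (id_in_use[next_id]) {
        next_id++;                          /* wraps naturally at 65535 */
    }
    id_in_use[next_id] = true;
    active_jobs++;
    return next_id++;
}

/* Called when a spawned job completes. */
void free_job_id(uint16_t id)
{
    id_in_use[id] = false;
    active_jobs--;
}
```

Because `active_jobs` can never reach 65536, the scan is guaranteed to find a free slot, so a long-running job that spawns and reaps children continuously keeps working even after the counter wraps.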
>
>
> On Oct 22, 2008, at 2:59 PM, Ralph Castain wrote:
>
>> I can't swear to this because I haven't fully grokked it yet, but I
>> believe the answer is:
>>
>> 1. If child jobs have completed, it won't hurt. I think the various
>> subsystems clean up their bookkeeping when a job completes, so we could
>> possibly reuse the number. There might be some race conditions we would
>> have to resolve.
>>
>> 2. If child jobs haven't completed (which is what this particular
>> user was attempting), then we would have a problem with jobid
>> confusion. Once we get the procs launched, though, I'm not sure how
>> much of a problem remains - we would have to investigate. It could
>> cause some bookkeeping problems for job completion.
>>
>> Interesting possibility, though...consider it another option for now.
>>
>>
>>
>> On Oct 22, 2008, at 12:53 PM, George Bosilca wrote:
>>
>> > What happens if the counter rolls around?
>> >
>> > george.
>> >
>> > On Oct 22, 2008, at 2:49 PM, Ralph Castain wrote:
>> >
>> >> There recently was activity on the mailing lists where someone was
>> >> attempting to call comm_spawn 100,000 times. Setting aside the
>> >> threading issues that were the focus of that exchange, the fact is
>> >> that OMPI currently cannot handle that many comm_spawns.
>> >>
>> >> The ORTE jobid is composed of two elements:
>> >>
>> >> 1. the top 16-bits is an "identifier" for that mpirun
>> >>
>> >> 2. the lower 16-bits is a running counter identifying the specific
>> >> job/launch for those procs.
>> >>
>> >> Thus, we are limited to 64k comm_spawns.
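[Editor's sketch of the 32-bit jobid layout Ralph describes. The macro names are illustrative, not the actual ORTE definitions:]

```c
/* A 32-bit jobid split into a 16-bit mpirun identifier (high bits)
 * and a 16-bit running launch counter (low bits). Illustrative
 * macros, not the real ORTE API. */
#include <stdint.h>

typedef uint32_t jobid_t;

#define MAKE_JOBID(mpirun, launch) \
    (((jobid_t)(mpirun) << 16) | ((jobid_t)(launch) & 0xFFFFu))
#define JOBID_MPIRUN(jobid)  ((uint16_t)((jobid) >> 16))
#define JOBID_LAUNCH(jobid)  ((uint16_t)((jobid) & 0xFFFFu))
```

With only 16 bits for the launch counter, the 65536th spawn under one mpirun wraps the counter back to 0, which is exactly the 64k limit discussed here.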
>> >>
>> >> Expanding this would require either revamping the entire way we
>> >> handle jobs (e.g., removing the mpirun identifier - major effort),
>> >> or expanding the orte_jobid_t from its current 32-bits to 64-bits.
>> >>
>> >> Is this a problem we want to address?
>> >> Ralph
>> >>
>> >> _______________________________________________
>> >> devel mailing list
>> >> devel_at_[hidden]
>> >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> >
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>

-- 
============================================
Andreas Schäfer
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
0049/3641-9-46376
PGP/GPG key via keyserver
I'm a bright... http://www.the-brights.net
============================================
(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your 
signature to help him gain world domination!

