Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Inherent limit on #communicators?
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-04-30 14:44:25

As an FYI, the code runs just fine using OMPI 1.2.x - it is only in 1.3.x that
the problem arises.

So it is definitely something that changed in the 1.3 series.


On Thu, Apr 30, 2009 at 12:36 PM, Brian W. Barrett <brbarret_at_[hidden]> wrote:

> When we added the CM PML, we added a pml_max_contextid field to the PML
> structure, which is the max size cid the PML can handle (because the
> matching interfaces don't allow 32 bits to be used for the cid). At the same
> time, the max cid for OB1 was shrunk significantly, so that the header on a
> short message would be packed tightly with no alignment padding.
>
> At the time, we believed 32k simultaneous communicators was plenty, and
> that CIDs were reused (we checked, I'm pretty sure). It sounds like someone
> removed the CID reuse code, which seems rather bad to me. There have to be
> unused CIDs in Ralph's example - is there a way to fall back out of the block
> algorithm when it can't find a new CID, and find one it can reuse instead?
> Other than setting the multi-threaded case back on, that is?
>
> Brian
> On Thu, 30 Apr 2009, Edgar Gabriel wrote:
>> cids are in fact not recycled in the block algorithm. The problem is that
>> comm_free is not collective, so you cannot make any assumptions about
>> whether other procs have also released that communicator.
>>
>> But nevertheless, a cid in the communicator structure is a uint32_t, so it
>> should not hit the 16k limit there yet. This is not new, so if there is a
>> discrepancy between what the comm structure assumes a cid is and what
>> the pml assumes, then this has been in the code since the very first days
>> of Open MPI...
>>
>> Thanks
>> Edgar
>> Brian W. Barrett wrote:
>>> On Thu, 30 Apr 2009, Ralph Castain wrote:
>>>> We seem to have hit a problem here - it looks like we are seeing a
>>>> built-in limit on the number of communicators one can create in a
>>>> program. The program basically does a loop, calling MPI_Comm_split each
>>>> time through the loop to create a sub-communicator, does a reduce
>>>> operation on the members of the sub-communicator, and then calls
>>>> MPI_Comm_free to release it (this is a minimized reproducer for the real
>>>> code). After 64k times through the loop, the program fails.
>>>>
>>>> This looks remarkably like a 16-bit index that hits a max value and then
>>>> blocks.
>>>>
>>>> I have looked at the communicator code, but I don't immediately see such
>>>> a field. Is anyone aware of some other place where we would have a limit
>>>> that would cause this problem?
>>>
>>> There's a maximum of 32768 communicator ids when using OB1 (each PML can
>>> set the max contextid, although the communicator code is the part that
>>> actually assigns a cid). Assuming that comm_free is actually properly
>>> called, there should be plenty of cids available for that pattern. However,
>>> I'm not sure I understand the block algorithm someone added to cid
>>> allocation - I'd have to guess that there's something funny with that
>>> routine and cids aren't being recycled properly.
>>> Brian
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
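
The failure mode discussed in this thread (a create/free loop exhausting a bounded cid space because freed cids are never handed out again) can be illustrated with a toy, single-process model. This is only a sketch under stated assumptions, not Open MPI's block algorithm: MAX_CID is taken from the 32768 figure quoted above for OB1, and cid_pool, cid_alloc_monotonic, cid_alloc_recycling, cid_free and run_loop are hypothetical names invented for the illustration. Because it runs in one process, it also sidesteps the point raised above that comm_free is not collective, which is what makes safe reuse hard in the real code. It does not attempt to explain why the reported failure appears at 64k iterations rather than 32k.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the 32768-cid limit quoted in the thread for OB1. */
#define MAX_CID 32768u

/* Toy allocator state (hypothetical; not an Open MPI data structure). */
typedef struct {
    uint32_t next;            /* next never-used cid (monotonic mode) */
    bool     in_use[MAX_CID]; /* which cids are currently held */
} cid_pool;

static void cid_pool_init(cid_pool *p)
{
    memset(p, 0, sizeof *p);
}

/* Monotonic allocation: always hand out a fresh cid and never revisit
 * freed ones. Returns -1 once the cid space has been consumed. */
static int64_t cid_alloc_monotonic(cid_pool *p)
{
    if (p->next >= MAX_CID)
        return -1;
    p->in_use[p->next] = true;
    return p->next++;
}

/* Recycling allocation: hand out the lowest cid not currently in use.
 * Returns -1 only if all MAX_CID cids are held at the same time. */
static int64_t cid_alloc_recycling(cid_pool *p)
{
    for (uint32_t c = 0; c < MAX_CID; c++) {
        if (!p->in_use[c]) {
            p->in_use[c] = true;
            return c;
        }
    }
    return -1;
}

/* Release a cid. In monotonic mode this has no effect on later allocations. */
static void cid_free(cid_pool *p, int64_t cid)
{
    if (cid >= 0 && cid < (int64_t)MAX_CID)
        p->in_use[cid] = false;
}

/* Mimic the reproducer's pattern: allocate one cid, release it, repeat.
 * Returns the number of iterations completed before allocation failed. */
static uint32_t run_loop(bool recycle, uint32_t iterations)
{
    static cid_pool pool; /* static: keeps the 32 KiB flag array off the stack */
    cid_pool_init(&pool);
    for (uint32_t i = 0; i < iterations; i++) {
        int64_t cid = recycle ? cid_alloc_recycling(&pool)
                              : cid_alloc_monotonic(&pool);
        if (cid < 0)
            return i;
        cid_free(&pool, cid);
    }
    return iterations;
}
```

Calling run_loop(false, 100000) stops after MAX_CID iterations even though at most one cid is ever held at a time, while run_loop(true, 100000) completes every iteration. That difference is the behavior the thread attributes to cids not being recycled.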