We seem to have hit a problem here - it looks like we are seeing a built-in
limit on the number of communicators one can create in a program. The
program basically does a loop, calling MPI_Comm_split each time through the
loop to create a sub-communicator, does a reduce operation on the members of
the sub-communicator, and then calls MPI_Comm_free to release it (this is a
minimized reproducer for the real code). After 64k times through the loop,
the program fails.
This looks remarkably like a 16-bit index that hits a max value and then
I have looked at the communicator code, but I don't immediately see such a
field. Is anyone aware of some other place where we would have a limit that
would cause this problem?