Thanks, although “An intercommunicator cannot be used for collective communication”, i.e. bcast calls. I can see how the MPI_Group_xx calls can be used to produce a useful group and then a communicator. Thanks again, but this is really a side issue to my main question about MPI_Bcast.
I seem to have duplicate concurrent processes interfering with each other. This would appear to be a breach of the MPI safety dictum, i.e. MPI_COMM_WORLD is supposed to include only the processes started by a single mpirun command and to isolate these processes safely from other similar groups of processes.
So, it would appear to be a bug. If so, this has significant implications for environments such as mine, where the same program may often be run by different users simultaneously.
It is really this issue that is concerning me. I can rewrite the code, but if it can crash when 2 copies run at the same time, I have a much bigger problem.
My suspicion is that, within the MPI_Bcast handshaking, a synchronising broadcast call may be colliding across the environments. My only evidence is that an otherwise working program waits on broadcast reception forever when two or more copies are run at [exactly] the same time.
Has anyone else seen similar behavior in concurrently running programs, perhaps ones that perform lots of broadcasts?
--- On Sun, 8/8/10, David Zhang <email@example.com> wrote: