
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] MPI_Bcast issue
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-08-08 23:32:57


Hi Randolph

Unless your code is doing a connect/accept between the copies, there is no way they can cross-communicate. As you note, mpirun instances are completely isolated from each other - no process in one instance can possibly receive information from a process in another instance, because it lacks all knowledge of it -unless- they wire up into a greater communicator by performing connect/accept calls between them.

I suspect you are inadvertently doing just that - perhaps by doing connect/accept in a tree-like manner, not realizing that the end result is one giant communicator that now links together all the N servers.

Otherwise, there is no possible way an MPI_Bcast in one mpirun can collide or otherwise communicate with an MPI_Bcast between processes started by another mpirun.
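
For reference, a minimal sketch of the connect/accept wire-up described above; the "server"/"client" roles and passing the port string on the command line are assumptions made purely for illustration, not anything taken from Randolph's program:

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char     port[MPI_MAX_PORT_NAME] = "";
    MPI_Comm inter;                   /* intercommunicator joining both jobs */
    int      rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        /* First mpirun: rank 0 publishes a port, then the whole job accepts. */
        if (rank == 0) {
            MPI_Open_port(MPI_INFO_NULL, port);
            printf("port name: %s\n", port);  /* hand this string to the client job */
        }
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
        if (rank == 0)
            MPI_Close_port(port);
    } else if (argc > 2 && strcmp(argv[1], "client") == 0) {
        /* Second mpirun: connect using the port string printed by the server. */
        MPI_Comm_connect(argv[2], MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
    } else {
        MPI_Finalize();
        return 1;
    }

    /* From here on the two jobs share an intercommunicator and are no
     * longer isolated from each other. */
    MPI_Comm_disconnect(&inter);
    MPI_Finalize();
    return 0;
}

Without such an exchange, each mpirun's MPI_COMM_WORLD stays entirely self-contained.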

On Aug 8, 2010, at 7:13 PM, Randolph Pullen wrote:

> Thanks. Although "An intercommunicator cannot be used for collective communication" (i.e., bcast calls), I can see how the MPI_Group_xx calls can be used to produce a useful group and then a communicator. Thanks again, but this is really a side issue to my main question about MPI_Bcast.
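
A minimal sketch of the MPI_Group_xx route mentioned above; the even-rank selection is purely illustrative and not taken from the original code:

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Group world_grp, even_grp;
    MPI_Comm  even_comm;
    int       size, i, n = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_group(MPI_COMM_WORLD, &world_grp);

    /* Pick the even ranks out of MPI_COMM_WORLD (illustrative choice only). */
    int *ranks = malloc(size * sizeof *ranks);
    for (i = 0; i < size; i += 2)
        ranks[n++] = i;

    MPI_Group_incl(world_grp, n, ranks, &even_grp);
    MPI_Comm_create(MPI_COMM_WORLD, even_grp, &even_comm);

    if (even_comm != MPI_COMM_NULL) {
        /* Collectives (including MPI_Bcast) here involve only the even ranks. */
        MPI_Comm_free(&even_comm);
    }

    MPI_Group_free(&even_grp);
    MPI_Group_free(&world_grp);
    free(ranks);
    MPI_Finalize();
    return 0;
}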
>
> I seem to have duplicate concurrent processes interfering with each other. This would appear to be a breach of the MPI safety dictum, i.e. MPI_COMM_WORLD is supposed to include only the processes started by a single mpirun command and to isolate these processes safely from other, similar groups of processes.
>
> So it would appear to be a bug. If so, this has significant implications for environments such as mine, where the same program may often be run by different users simultaneously.
>
> It is really this issue that is concerning me. I can rewrite the code, but if it can crash when two copies run at the same time, I have a much bigger problem.
>
> My suspicion is that, within the MPI_Bcast handshaking, a synchronising broadcast call may be colliding across the environments. My only evidence is that an otherwise working program waits on broadcast reception forever when two or more copies are run at [exactly] the same time.
>
> Has anyone else seen similar behavior in concurrently running programs that perform lots of broadcasts perhaps?
>
> Randolph
>
>
> --- On Sun, 8/8/10, David Zhang <solarbikedz_at_[hidden]> wrote:
>
> From: David Zhang <solarbikedz_at_[hidden]>
> Subject: Re: [OMPI users] MPI_Bcast issue
> To: "Open MPI Users" <users_at_[hidden]>
> Received: Sunday, 8 August, 2010, 12:34 PM
>
> In particular, intercommunicators
>
> On 8/7/10, Aurélien Bouteiller <bouteill_at_[hidden]> wrote:
> > You should consider reading about communicators in MPI.
> >
> > Aurelien
> > --
> > Aurelien Bouteiller, Ph.D.
> > Innovative Computing Laboratory, The University of Tennessee.
> >
> > Sent from my iPad
> >
> > On Aug 7, 2010, at 1:05, Randolph Pullen <randolph_pullen_at_[hidden]> wrote:
> >
> >> I seem to be having a problem with MPI_Bcast.
> >> My massive, I/O-intensive data movement program must broadcast from n to n
> >> nodes. My problem starts because I require two processes per node, a sender
> >> and a receiver, and I have implemented these as MPI processes rather than
> >> tackle the complexities of threads in MPI.
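
A minimal sketch of one way to give those two roles their own communicators; the even-ranks-send convention is an assumption made here for illustration, not something stated in the post:

#include <mpi.h>

int main(int argc, char **argv)
{
    int      rank;
    MPI_Comm role_comm;               /* senders-only or receivers-only */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Assumed convention: even ranks send, odd ranks receive. */
    int is_sender = (rank % 2 == 0);
    MPI_Comm_split(MPI_COMM_WORLD, is_sender, rank, &role_comm);

    /* Collectives on role_comm involve only the senders (or only the
     * receivers) started by this mpirun, not every process in
     * MPI_COMM_WORLD. */

    MPI_Comm_free(&role_comm);
    MPI_Finalize();
    return 0;
}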
> >>
> >> Consequently, broadcast and calls like alltoall are not completely
> >> helpful. The dataset is huge and each node must end up with a complete
> >> copy built by the large number of contributing broadcasts from the sending
> >> nodes. Network efficiency and run time are paramount.
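
A sketch of the exchange pattern just described, where every sender takes a turn as the MPI_Bcast root so that all ranks end up with the complete dataset; the block size and data are placeholders, not values from the original program:

#include <mpi.h>
#include <stdlib.h>

#define BLOCK 1024                    /* assumed per-sender block size */

int main(int argc, char **argv)
{
    int rank, size, root, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *all = malloc((size_t)size * BLOCK * sizeof *all);

    for (root = 0; root < size; ++root) {
        if (rank == root)             /* root fills its own block first */
            for (i = 0; i < BLOCK; ++i)
                all[root * BLOCK + i] = (double)rank;   /* stand-in data */

        /* One broadcast per contributing sender; after the loop every
         * rank holds every sender's block. */
        MPI_Bcast(&all[root * BLOCK], BLOCK, MPI_DOUBLE, root, MPI_COMM_WORLD);
    }

    free(all);
    MPI_Finalize();
    return 0;
}

MPI_Allgather(v) expresses essentially the same exchange in a single collective call, which may be worth comparing against the broadcast loop.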
> >>
> >> As I don’t want to needlessly broadcast all this data to the sending nodes
> >> and I have a perfectly good MPI program that distributes globally from a
> >> single node (1 to N), I took the unusual decision to start N copies of
> >> this program by spawning the MPI system from the PVM system in an effort
> >> to get my N to N concurrent transfers.
> >>
> >> It seems that the broadcasts running on concurrent MPI environments
> >> collide and cause all but the first process to hang waiting for their
> >> broadcasts. This theory seems to be confirmed by introducing a sleep of
> >> n-1 seconds before the first MPI_Bcast call on each node, which results
> >> in the code working perfectly. (total run time 55 seconds, 3 nodes,
> >> standard TCP stack)
> >>
> >> My guess is that, unlike PVM, Open MPI implements broadcasts with
> >> point-to-point messages rather than multicasts. Can someone confirm this?
> >> Is this a bug?
> >>
> >> Is there any multicast or N to N broadcast where sender processes can
> >> avoid participating when they don’t need to?
> >>
> >> Thanks in advance
> >> Randolph
> >>
> >>
> >>
> >
>
> --
> Sent from my mobile device
>
> David Zhang
> University of California, San Diego
>