Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Reduce performance
From: Eugene Loh (eugene.loh_at_[hidden])
Date: 2010-09-09 15:32:20

Alex A. Granovsky wrote:
Isn't it evident from the theory of random processes and probability theory that in the limit of an infinitely
large cluster and parallel process, the probability of deadlock with the current implementation is unfortunately
quite a finite quantity and in the limit approaches unity regardless of any particular details of the program?

No, not at all. Consider simulating a physical volume. Each process is assigned to some small subvolume. It updates conditions locally, but on the surface of its simulation subvolume it needs information from "nearby" processes. It cannot proceed along the surface until it has that neighboring information. Its neighbors, in turn, cannot proceed until their neighbors have reached some point.

Two distant processes can be quite out of step with one another, but only by some bounded amount. At some point, a leading process has to wait for information from a laggard to propagate to it. All processes proceed together, in some loose lock-step fashion. Many applications behave in this fashion. Actually, in many applications, the synchronization is tightened in that "physics" is made to propagate faster than neighbor-by-neighbor.
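
The bounded-skew argument can be illustrated with a toy model. This is a rough sketch, not MPI code: it assumes a 1-D chain of processes scheduled in random order, where a process may finish its current step only once each neighbor has reached at least that step. The function name `simulate` and its parameters are made up for illustration.

```python
import random


def simulate(num_procs=16, target_step=50, seed=0):
    """Toy model of neighbor-synchronized time stepping on a 1-D chain.

    Assumption (illustrative only): a process can complete its current
    step only if every neighbor has already reached that step, which
    stands in for waiting on boundary data from nearby processes.
    """
    rng = random.Random(seed)
    step = [0] * num_procs      # steps completed by each process
    max_global_skew = 0         # largest lead of any process over any other
    max_neighbor_skew = 0       # largest lead between adjacent processes

    while min(step) < target_step:
        i = rng.randrange(num_procs)  # pick a process to try to run
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < num_procs]

        # The process advances only when its neighbors are not behind it.
        ready = all(step[j] >= step[i] for j in neighbors)
        if ready and step[i] < target_step:
            step[i] += 1

        max_global_skew = max(max_global_skew, max(step) - min(step))
        for a in range(num_procs - 1):
            max_neighbor_skew = max(max_neighbor_skew,
                                    abs(step[a] - step[a + 1]))

    return step, max_global_skew, max_neighbor_skew


if __name__ == "__main__":
    steps, global_skew, neighbor_skew = simulate()
    print("all finished:", all(s == 50 for s in steps))
    print("max skew between any two processes:", global_skew)
    print("max skew between neighbors:", neighbor_skew)
```

In this model adjacent processes never differ by more than one step, so two processes k positions apart differ by at most k steps. The slowest process is never blocked, because its neighbors are never behind it, so the run always makes progress rather than deadlocking.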

As the number of processes increases, the laggard might seem relatively slower in comparison, but that isn't deadlock.

As the size of the cluster increases, the chances of a system component failure increase, but that also is a different matter.