First of all, the reason I created a CPU-friendly version of MPI_Barrier is that my program is asymmetric (so some of the nodes can easily have to wait for several hours) and I/O bound. My program uses MPI mainly to synchronize I/O and to share some counters between the nodes, followed by a gather/scatter of the files. MPI_Barrier (or any of the other MPI calls) caused the four cores of my quad-core CPU to run continuously at 100% because of the aggressive polling, making the server almost unusable and also slowing my program down because less CPU time was available for I/O and file synchronization. With this version of MPI_Barrier, CPU usage averages out at about 25%. I only recently learned about the OMPI_MCA_mpi_yield_when_idle variable; I still have to test whether it is an alternative to my workaround.
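For illustration, a minimal sketch of the kind of barrier I mean (not my exact routine; the tag value and the 1 ms poll interval are arbitrary choices): every rank signals rank 0 and then waits for a release message, polling with MPI_Iprobe and sleeping between polls instead of spinning.

    #include <mpi.h>
    #include <time.h>

    #define BARRIER_TAG 4242          /* arbitrary unused tag */

    /* Wait for a matching message without busy-waiting. */
    static void lazy_recv(int src, MPI_Comm comm)
    {
        int flag = 0;
        while (!flag) {
            MPI_Iprobe(src, BARRIER_TAG, comm, &flag, MPI_STATUS_IGNORE);
            if (!flag) {
                struct timespec ts = { 0, 1000000 };   /* sleep 1 ms */
                nanosleep(&ts, NULL);
            }
        }
        MPI_Recv(NULL, 0, MPI_BYTE, src, BARRIER_TAG, comm,
                 MPI_STATUS_IGNORE);
    }

    void my_barrier(MPI_Comm comm)
    {
        int rank, size, i;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);
        if (rank == 0) {
            for (i = 1; i < size; i++)                 /* gather phase */
                lazy_recv(i, comm);
            for (i = 1; i < size; i++)                 /* release phase */
                MPI_Send(NULL, 0, MPI_BYTE, i, BARRIER_TAG, comm);
        } else {
            MPI_Send(NULL, 0, MPI_BYTE, 0, BARRIER_TAG, comm);
            lazy_recv(0, comm);
        }
    }

For reference, the MCA parameter mentioned above can also be given on the command line (mpirun --mca mpi_yield_when_idle 1 ...), which makes Open MPI yield the CPU inside its progress loop instead of spinning at full speed.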
On Sun, 2009-12-13 at 19:04 +0100, Gijsbert Wiesenekker wrote:
> The following routine gives a problem after some (not reproducible)
> time on Fedora Core 12. The routine is a CPU usage friendly version of
> MPI_Barrier.

There are some proposals for non-blocking collectives before the MPI forum currently, and I believe there is a working implementation which can be used as a plug-in for OpenMPI; I would urge you to look at these rather than try to implement your own.
Your code both does all-to-all communication and uses probe; both of these can easily be avoided when implementing a barrier.
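To make that concrete, here is a minimal sketch of a dissemination barrier (my illustration under those constraints, not the plug-in referred to above; the tag value is an arbitrary assumption). Each rank completes ceil(log2(P)) rounds of zero-byte point-to-point exchanges, with no all-to-all traffic and no probing:

    #include <mpi.h>

    #define DISSEM_TAG 4243           /* arbitrary unused tag */

    void dissemination_barrier(MPI_Comm comm)
    {
        int rank, size, dist;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);
        /* In round k each rank signals the rank 2^k ahead of it and
         * waits for the rank 2^k behind it; after all rounds every
         * rank transitively depends on every other rank having
         * arrived, which is exactly the barrier property. */
        for (dist = 1; dist < size; dist <<= 1) {
            int to   = (rank + dist) % size;
            int from = (rank - dist + size) % size;
            MPI_Sendrecv(NULL, 0, MPI_BYTE, to,   DISSEM_TAG,
                         NULL, 0, MPI_BYTE, from, DISSEM_TAG,
                         comm, MPI_STATUS_IGNORE);
        }
    }

Note that the blocking MPI_Sendrecv calls will still spin inside the library unless yielding is enabled, so this addresses the communication pattern rather than the CPU usage question.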
> My question is: is there a problem with this routine that I overlooked
> that somehow did not show up until now? Is there a way to see which
> messages have been sent/received/are pending?

Yes, there is a message queue interface that allows tools to peek inside the MPI library and see these queues. As far as I know there are three tools which use it: TotalView, DDT, and my own tool, padb. TotalView and DDT are both full-featured graphical debuggers and commercial products; padb is an open-source, text-based tool.
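For completeness, a hedged usage sketch for padb (the job id is made up and flag spellings are from memory, so check padb --help for the exact switches):

    # Full report for job 1234: stack traces, MPI message queues, etc.
    padb --full-report=1234

There is also a mode that reports just the MPI message queues on their own.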
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing