Title: Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations


T. Hoefler, P. Gottschling, W. Rehm, A. Lumsdaine


This paper presents a case study of the applicability and use of non-blocking collective operations. These operations make it possible to overlap communication with computation and to avoid unnecessary synchronization. We introduce our NBC library, a portable, low-overhead implementation of non-blocking collectives on top of MPI-1. We demonstrate how easy the NBC library is to use by optimizing a conjugate gradient solver with only minor changes to the traditional parallel implementation of the program. The optimized solver runs up to 34% faster and is able to overlap most of the communication. Because of this overlap, we observe no performance difference between Gigabit Ethernet and InfiniBand for our calculation.

Presented: Euro PVM/MPI 2006, September 2006, Bonn, Germany.


euro-pvmmpi-2006-libnbc.pdf (PDF)

