There are a few reasons why this might be occurring. Did you build with the '--enable-ft-thread' option?
If so, it looks like I didn't carry the thread_sleep_wait adjustment over from the trunk, so the C/R thread is being a bit too aggressive. Try adding the following to your mpirun command line options, and see whether it changes the performance.
"-mca opal_cr_thread_sleep_wait 1000"
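For instance, combined with the "-am ft-enable-cr" option from your original run, the full launch line might look like the sketch below. The process count (4) and the application name (./my_app) are placeholders I made up for illustration, not values from your setup.

```shell
#!/bin/sh
# Placeholders (assumptions, not from the original report):
NP=4            # number of MPI processes
APP=./my_app    # path to your MPI application

# Same C/R-enabled launch as before, plus the sleep-wait MCA parameter.
CMD="mpirun -np $NP -am ft-enable-cr -mca opal_cr_thread_sleep_wait 1000 $APP"
echo "$CMD"

# Launch only if mpirun is on PATH and the application is executable.
if command -v mpirun >/dev/null 2>&1 && [ -x "$APP" ]; then
    $CMD
fi
```

Comparing the wall-clock time of this run against a run without "-am ft-enable-cr" should show whether the thread was the source of the overhead.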
There are other places to look as well, depending on how frequently your application communicates, how often you checkpoint, the process layout, and so on. But usually the aggressive nature of the thread is the main problem.
Let me know if that helps.
On Feb 8, 2011, at 2:50 AM, Nguyen Toan wrote:
> Hi all,
> I am using the latest version of OpenMPI (1.5.1) and BLCR (0.8.2).
> I found that when running an application, which uses MPI_Isend, MPI_Irecv and MPI_Wait,
> enabling C/R, i.e., using "-am ft-enable-cr", the application runtime is much longer than the normal execution with mpirun (no checkpoint was taken).
> This overhead becomes larger when the normal execution runtime is longer.
> Does anybody have any idea about this overhead, and how to eliminate it?
Postdoctoral Research Associate
Oak Ridge National Laboratory