
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Unknown overhead in "mpirun -am ft-enable-cr"
From: Nguyen Toan (nguyentoan1508_at_[hidden])
Date: 2011-02-09 01:40:32


Hi Josh,
Thanks for the reply. I did not use the '--enable-ft-thread' option. Here are
my build options:

CFLAGS=-g \
./configure \
--with-ft=cr \
--enable-mpi-threads \
--with-blcr=/home/nguyen/opt/blcr \
--with-blcr-libdir=/home/nguyen/opt/blcr/lib \
--prefix=/home/nguyen/opt/openmpi \
--with-openib \
--enable-mpirun-prefix-by-default
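
Whether an installed build has checkpoint/restart support (and the checkpoint thread) compiled in can usually be read from the output of ompi_info. The following is a minimal sketch, not an official procedure: the helper name ft_summary is hypothetical, and the grep patterns are an assumption about how the relevant lines are labelled, which may differ between Open MPI versions.

```shell
# ft_summary: keep only the lines of ompi_info output (read from stdin)
# that mention checkpointing or the CRS framework.
# Assumption: the relevant lines contain "checkpoint" or "crs".
ft_summary() {
    grep -i -E 'checkpoint|crs'
}

# Run against the installed Open MPI, if ompi_info is on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | ft_summary
fi
```

If nothing is printed, either the patterns do not match this version's output or the build has no C/R support; checking the full ompi_info output by hand settles which.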

My application communicates heavily in every loop iteration, mainly through
MPI_Isend, MPI_Irecv and MPI_Wait. For my purposes I only want to take one
checkpoint per application run, but the unexplained overhead appears even
when no checkpoint is taken.
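
To put a number on the overhead, the same job can be timed with and without C/R enabled, and with the opal_cr_thread_sleep_wait setting suggested in the quoted reply below. This is only a sketch: the process count (4) and the application name (./my_app) are placeholders, and the third variant is expected to matter only for builds configured with '--enable-ft-thread'.

```shell
# Placeholder values: adjust to the real job.
NP=4
APP=./my_app

# cr_cmd: print the mpirun command line for one of three variants.
cr_cmd() {
    case "$1" in
        baseline) echo "mpirun -np $NP $APP" ;;
        cr)       echo "mpirun -np $NP -am ft-enable-cr $APP" ;;
        cr_sleep) echo "mpirun -np $NP -am ft-enable-cr -mca opal_cr_thread_sleep_wait 1000 $APP" ;;
        *)        echo "unknown variant: $1" >&2; return 1 ;;
    esac
}

# Print each command; set RUN=1 in the environment to execute and time it.
for v in baseline cr cr_sleep; do
    echo "== $v: $(cr_cmd "$v")"
    if [ "${RUN:-0}" = 1 ]; then
        time $(cr_cmd "$v")
    fi
done
```

Running it once without RUN=1 prints the three command lines so they can be checked before any job is launched.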

Do you have any other ideas?

Regards,
Nguyen Toan

On Wed, Feb 9, 2011 at 12:41 AM, Joshua Hursey <jjhursey_at_[hidden]> wrote:

> There are a few reasons why this might be occurring. Did you build with the
> '--enable-ft-thread' option?
>
> If so, it looks like I didn't move over the thread_sleep_wait adjustment
> from the trunk - the thread was being a bit too aggressive. Try adding the
> following to your command line options, and see if it changes the
> performance.
> "-mca opal_cr_thread_sleep_wait 1000"
>
> There are other places to look as well depending on how frequently your
> application communicates, how often you checkpoint, process layout, ... But
> usually the aggressive nature of the thread is the main problem.
>
> Let me know if that helps.
>
> -- Josh
>
> On Feb 8, 2011, at 2:50 AM, Nguyen Toan wrote:
>
> > Hi all,
> >
> > I am using the latest version of OpenMPI (1.5.1) and BLCR (0.8.2).
> > I found that when running an application that uses MPI_Isend, MPI_Irecv
> > and MPI_Wait with C/R enabled, i.e. using "-am ft-enable-cr", the
> > application runtime is much longer than a normal execution with mpirun
> > (no checkpoint was taken).
> > This overhead becomes larger when the normal execution runtime is longer.
> > Does anybody have any idea about this overhead, and how to eliminate it?
> > Thanks.
> >
> > Regards,
> > Nguyen
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> ------------------------------------
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey