Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Unknown overhead in "mpirun -am ft-enable-cr"
From: Nguyen Toan (nguyentoan1508_at_[hidden])
Date: 2011-02-25 13:31:01


Dear Josh,

Did you find the problem? I still cannot make any progress.
Hope to hear some good news from you.

Regards,
Nguyen Toan

On Sun, Feb 13, 2011 at 3:04 PM, Nguyen Toan <nguyentoan1508_at_[hidden]> wrote:

> Hi Josh,
>
> I tried the MCA parameter you mentioned, but it did not help; the unknown
> overhead still exists.
> I have attached the output of 'ompi_info' for both versions 1.5 and 1.5.1.
> I hope you can find the problem.
> Thank you.
>
> Regards,
> Nguyen Toan
>
> On Wed, Feb 9, 2011 at 11:08 PM, Joshua Hursey <jjhursey_at_[hidden]> wrote:
>
>> It looks like the logic in the configure script is turning on the FT
>> thread for you when you specify both '--with-ft=cr' and
>> '--enable-mpi-threads'.
>>
>> Can you send me the output of 'ompi_info'? Can you also try the MCA
>> parameter that I mentioned earlier to see if that changes the performance?
>>
>> If there are many non-blocking sends and receives, there might be a
>> performance bug in the way the point-to-point wrapper is tracking request
>> objects. If the above MCA parameter does not help the situation, let me know
>> and I might be able to take a look at this next week.
>>
>> Thanks,
>> Josh
>>
>> On Feb 9, 2011, at 1:40 AM, Nguyen Toan wrote:
>>
>> > Hi Josh,
>> > Thanks for the reply. I did not use the '--enable-ft-thread' option.
>> > Here are my build options:
>> >
>> > CFLAGS=-g \
>> > ./configure \
>> > --with-ft=cr \
>> > --enable-mpi-threads \
>> > --with-blcr=/home/nguyen/opt/blcr \
>> > --with-blcr-libdir=/home/nguyen/opt/blcr/lib \
>> > --prefix=/home/nguyen/opt/openmpi \
>> > --with-openib \
>> > --enable-mpirun-prefix-by-default
>> >
>> > My application communicates heavily in every loop iteration, mainly
>> > through MPI_Isend, MPI_Irecv and MPI_Wait. I only need to take one
>> > checkpoint per application run, but the unknown overhead exists even when
>> > no checkpoint is taken.
>> >
>> > Do you have any other ideas?
>> >
>> > Regards,
>> > Nguyen Toan
>> >
>> >
>> > On Wed, Feb 9, 2011 at 12:41 AM, Joshua Hursey <jjhursey_at_[hidden]> wrote:
>> > There are a few reasons why this might be occurring. Did you build with
>> > the '--enable-ft-thread' option?
>> >
>> > If so, it looks like I didn't move over the thread_sleep_wait adjustment
>> > from the trunk - the thread was being a bit too aggressive. Try adding the
>> > following to your command line options, and see if it changes the
>> > performance.
>> > "-mca opal_cr_thread_sleep_wait 1000"
>> >
>> > There are other places to look as well depending on how frequently your
>> > application communicates, how often you checkpoint, process layout, ... But
>> > usually the aggressive nature of the thread is the main problem.
>> >
>> > Let me know if that helps.
>> >
>> > -- Josh
>> >
>> > On Feb 8, 2011, at 2:50 AM, Nguyen Toan wrote:
>> >
>> > > Hi all,
>> > >
>> > > I am using the latest version of OpenMPI (1.5.1) and BLCR (0.8.2).
>> > > I found that when I run an application that uses MPI_Isend, MPI_Irecv
>> > > and MPI_Wait with C/R enabled, i.e. with "-am ft-enable-cr", the runtime
>> > > is much longer than a normal execution with mpirun, even though no
>> > > checkpoint was taken.
>> > > This overhead becomes larger as the normal execution runtime gets longer.
>> > > Does anybody have any idea what causes this overhead and how to
>> > > eliminate it?
>> > > Thanks.
>> > >
>> > > Regards,
>> > > Nguyen
>> > > _______________________________________________
>> > > users mailing list
>> > > users_at_[hidden]
>> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >
>> > ------------------------------------
>> > Joshua Hursey
>> > Postdoctoral Research Associate
>> > Oak Ridge National Laboratory
>> > http://users.nccs.gov/~jjhursey
>> >
>>
>> ------------------------------------
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey
>>
>
>
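
For anyone retracing this thread, the comparison discussed above could be
scripted roughly as follows. This is only a sketch: the process count and the
application name (./my_app) are placeholders that do not come from the thread,
and the only mpirun options used are the ones quoted in the messages
("-am ft-enable-cr" and "-mca opal_cr_thread_sleep_wait 1000") plus the
standard "-np". The script prints each command line and runs it only if mpirun
and the binary are actually present.

```shell
#!/bin/sh
# Placeholders (assumptions, not from the thread): process count and binary.
NP=4
APP=./my_app

# Three variants to compare: a plain run, a run with C/R enabled, and a run
# with C/R enabled plus the thread_sleep_wait value suggested in the thread.
BASELINE="mpirun -np $NP $APP"
WITH_CR="mpirun -np $NP -am ft-enable-cr $APP"
WITH_CR_TUNED="mpirun -np $NP -am ft-enable-cr -mca opal_cr_thread_sleep_wait 1000 $APP"

run_variant() {
    # $1 is a label, $2 is the full command line.
    echo "[$1] $2"
    # Execute only when mpirun and the application are actually available.
    if command -v mpirun >/dev/null 2>&1 && [ -x "$APP" ]; then
        start=$(date +%s)
        $2    # intentionally unquoted so the command line is split into words
        end=$(date +%s)
        echo "[$1] elapsed: $((end - start)) s"
    fi
}

run_variant baseline "$BASELINE"
run_variant cr "$WITH_CR"
run_variant cr-tuned "$WITH_CR_TUNED"
```

If the "cr" and "cr-tuned" timings are both well above the baseline even though
no checkpoint is taken, that would match the behaviour reported in this thread,
where the parameter alone did not remove the overhead.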