
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Some questions about checkpoint/restart (16)
From: Joshua Hursey (jjhursey_at_[hidden])
Date: 2010-12-22 14:11:55


Thanks for the questions. Keep them coming. I hope to have some time after the first of the year to make progress on some of the others. But for this one, I think you are correct. Does the attached patch (created from the Open MPI trunk r24190) fix this particular issue? If so, I'll go ahead and commit it to the trunk and ask for it to be brought over to the release series.
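
As a rough illustration of the diagnosis quoted below (this is a toy model, not the Open MPI source and not the attached patch): a helper thread spins until an "active" flag is set, so if the initialization path never sets that flag, the helper never services a checkpoint request while the application is busy. All names here (thread_is_active, thread_is_done, cr_thread_fn, run_model) are hypothetical stand-ins chosen for the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins for the flags named in the report. */
static atomic_bool thread_is_active;
static atomic_bool thread_is_done;
static atomic_bool checkpoint_requested;
static atomic_int  checkpoints_taken;

/* Toy checkpoint helper thread. */
static void *cr_thread_fn(void *arg)
{
    (void)arg;
    /* Wait to become active (same shape as the loop quoted in the report). */
    while (!atomic_load(&thread_is_active) && !atomic_load(&thread_is_done)) {
        sched_yield();
    }
    /* Once active, service checkpoint requests until told to stop. */
    while (!atomic_load(&thread_is_done)) {
        if (atomic_exchange(&checkpoint_requested, false)) {
            atomic_fetch_add(&checkpoints_taken, 1);
        }
        sched_yield();
    }
    return NULL;
}

/*
 * Model one run. init_activates_thread says whether the (hypothetical)
 * init routine marks the helper thread active. Returns how many
 * checkpoints the helper took while the "application" was busy.
 */
int run_model(bool init_activates_thread)
{
    pthread_t tid;
    struct timespec one_ms = { 0, 1000000L };
    int rc, taken;

    atomic_store(&thread_is_active, false);
    atomic_store(&thread_is_done, false);
    atomic_store(&checkpoint_requested, false);
    atomic_store(&checkpoints_taken, 0);

    rc = pthread_create(&tid, NULL, cr_thread_fn, NULL);
    assert(rc == 0);
    (void)rc;

    /* Models the init call: one variant activates the thread, one does not. */
    if (init_activates_thread) {
        atomic_store(&thread_is_active, true);
    }

    /* A checkpoint request arrives while the application is "sleeping". */
    atomic_store(&checkpoint_requested, true);
    for (int i = 0; i < 200 && atomic_load(&checkpoints_taken) == 0; i++) {
        nanosleep(&one_ms, NULL);   /* wait up to roughly 200 ms */
    }
    taken = atomic_load(&checkpoints_taken);

    /* Shut the helper down; this also releases it from the wait loop. */
    atomic_store(&thread_is_done, true);
    pthread_join(tid, NULL);
    return taken;
}
```

In this model, run_model(true) should report one checkpoint taken during the busy period and run_model(false) should report none, which mirrors the difference described between the MPI_Init and MPI_Init_thread cases. On older toolchains it may need to be built with -pthread.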

Thanks again,
Josh


On Dec 22, 2010, at 3:07 AM, Takayuki Seki wrote:

>
> I have a new question about Checkpoint/Restart.
>
> The 16th question is as follows:
>
> (16) If a program uses the MPI_Init_thread function,
> a checkpoint cannot be taken by the opal_cr_thread_fn thread.
>
> Framework : ompi/mpi
> Component : c
> The source file : ompi/mpi/c/init_thread.c
> The function name : MPI_Init_thread
>
>
> Here's the code that causes the problem:
>
> #define LOOP 60
>
> MPI_Barrier(MPI_COMM_WORLD);
> if (rank == 0) {
>     printf(" rank=%d 60 seconds sleeping start \n",rank); fflush(stdout);
> }
> for (i=0;i<LOOP;i++) { /* Take checkpoint while the process is in this loop. */
>     sleep(1);
>     if (rank == 0) {
>         printf(" rank=%d loop=%d \n",rank,i); fflush(stdout);
>     }
> }
> if (rank == 0) {
>     printf(" rank=%d 60 seconds sleeping finished \n",rank); fflush(stdout);
> }
> MPI_Barrier(MPI_COMM_WORLD);
> if (rank == 0) {
>     printf(" rank=%d executes Finalize \n",rank); fflush(stdout);
> }
> MPI_Finalize();
>
>
> * This problem can be reproduced even with a single process.
>
> mpiexec -n 1 .... ./a.out
>
> * Take the checkpoint while the process is in the 60-second loop.
>
> * Example restart output of a program that uses MPI_Init:
>
> -bash-3.2$ ompi-restart ompi_global_snapshot_20762.ckpt
> rank=0 loop=42
> rank=0 loop=43
> rank=0 loop=44
> rank=0 loop=45
> rank=0 loop=46
> rank=0 loop=47
> rank=0 loop=48
> rank=0 loop=49
> rank=0 loop=50
> rank=0 loop=51
> rank=0 loop=52
> rank=0 loop=53
> rank=0 loop=54
> rank=0 loop=55
> rank=0 loop=56
> rank=0 loop=57
> rank=0 loop=58
> rank=0 loop=59
> rank=0 60 seconds sleeping finished
> rank=0 executes Finalize
> rank=0 program end
>
> Because the checkpoint was taken by the opal_cr_thread_fn function
> as soon as the checkpoint operation was requested,
> the program restarts from inside the loop.
>
> * Example restart output of a program that uses MPI_Init_thread:
>
> -bash-3.2$ ompi-restart ompi_global_snapshot_20660.ckpt
> rank=0 executes Finalize
> rank=0 program end
>
> The checkpoint was actually taken in the MPI_Barrier function
> after the loop.
> Therefore, the program restarts from the MPI_Barrier function.
>
>
> * I think the problem is that MPI_Init_thread does not execute OPAL_CR_INIT_LIBRARY.
> As a result, opal_cr_thread_is_active remains false,
> and the following while loop does not terminate.
>
> /*
>  * Wait to become active
>  */
> while( !opal_cr_thread_is_active && !opal_cr_thread_is_done) {
>     sched_yield();
> }
>
>
> * MPI_Init_thread uses OPAL_CR_ENTER_LIBRARY and OPAL_CR_EXIT_LIBRARY.
> I think this is not correct.
> Because MPI_Init_thread is an MPI initialization function,
> I think it should behave the same way as MPI_Init.
>
>
> -bash-3.2$ cat t_mpi_question-16.c
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include "mpi.h"
>
> #define LOOP 60
>
> int main(int ac,char **av)
> {
>     int i;
>     int rank,size;
>     int required,provided,provided_for_query;
>
>     required = MPI_THREAD_SINGLE;
>     provided = -1;
>     provided_for_query = -1;
> #if defined(USE_INITTHREAD)
>     MPI_Init_thread(&ac,&av,required,&provided);
>     MPI_Query_thread(&provided_for_query);
> #else
>     MPI_Init(&ac,&av);
> #endif
>     MPI_Comm_rank(MPI_COMM_WORLD,&rank);
>     MPI_Comm_size(MPI_COMM_WORLD,&size);
>     if (rank == 0) {
>         printf(" rank=%d sz=%d required=%d provided=%d provided_for_query=%d \n"
>                ,rank,size,required,provided,provided_for_query); fflush(stdout);
>     }
>
>     MPI_Barrier(MPI_COMM_WORLD);
>
>     if (rank == 0) {
>         printf(" rank=%d 60 seconds sleeping start \n",rank); fflush(stdout);
>     }
>     for (i=0;i<LOOP;i++) {
>         sleep(1);
>         if (rank == 0) {
>             printf(" rank=%d loop=%d \n",rank,i); fflush(stdout);
>         }
>     }
>     if (rank == 0) {
>         printf(" rank=%d 60 seconds sleeping finished \n",rank); fflush(stdout);
>     }
>
>     MPI_Barrier(MPI_COMM_WORLD);
>     if (rank == 0) {
>         printf(" rank=%d executes Finalize \n",rank); fflush(stdout);
>     }
>     MPI_Finalize();
>     if (rank == 0) {
>         printf(" rank=%d program end \n",rank); fflush(stdout);
>     }
>     return(0);
> }
>
>

------------------------------------
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey