Thanks,

But I have put an mpi_waitall(request) before

cout << " I am rank " << rank << " I am before MPI_Finalize()" << endl;

If that line has been printed, it means that all requests have been checked and have finished, right?

What could be causing the hang?

Any help is appreciated.

Jack
Oct. 25 2010
Date: Mon, 25 Oct 2010 05:32:44 -0400
From: terry.dontje@oracle.com
To: users@open-mpi.org
Subject: Re: [OMPI users] Open MPI program cannot complete
So what you are saying is *all* the ranks have entered MPI_Finalize and only a subset has exited per placing prints before and after MPI_Finalize. Good. So my guess is that the processes stuck in MPI_Finalize have a prior MPI request outstanding that for whatever reason is unable to complete. So I would first look at all the MPI requests and make sure they completed.
--td
On 10/25/2010 02:38 AM, Jack Bryan wrote:
thanks
I found a problem:
I used:
cout << " I am rank " << rank << " I am before MPI_Finalize()" << endl;
MPI_Finalize();
cout << " I am rank " << rank << " I am after MPI_Finalize()" << endl;
return 0;
I can get the output " I am rank 0 (1, 2, ....) I am before MPI_Finalize() "
and " I am rank 0 I am after MPI_Finalize() ".
But the other processes do not print "I am rank ... I am after MPI_Finalize()".
It is weird. The processes have reached the point just before MPI_Finalize(), so why do they hang there?
Are there better ways to check this?
Any help is appreciated.
thanks
Jack
Oct. 25 2010
From: solarbikedz@gmail.com
Date: Sun, 24 Oct 2010 19:47:54 -0700
To: users@open-mpi.org
Subject: Re: [OMPI users] Open MPI program cannot complete
How do you know all processes call mpi_finalize? Did you have all of them print out something before they call mpi_finalize? I think what Gustavo is getting at is that maybe you had some MPI calls within your snippets that hang your program, so some of your processes never called mpi_finalize.
On Sun, Oct 24, 2010 at 6:59 PM, Jack Bryan <dtustudy68@hotmail.com> wrote:
Thanks,
But, my code is too long to be posted.
What are the common causes of this kind of problem?
Any help is appreciated.
Jack
Oct. 24 2010
> Date: Sun, 24 Oct 2010 18:09:52 -0400
> To: users@open-mpi.org
> Subject: Re: [OMPI users] Open MPI program cannot complete
>
> Hi Jack
>
> Your code snippet is too terse, doesn't show the MPI calls.
> It is hard to guess what is the problem this way.
>
> Gus Correa
> On Oct 24, 2010, at 5:43 PM, Jack Bryan wrote:
>
> > Thanks for the reply.
> > But I use mpi_waitall() to make sure that all MPI communications have been done before a process calls MPI_Finalize() and returns.
> >
> > Any help is appreciated.
> >
> > thanks
> >
> > Jack
> >
> > Oct. 24 2010
> >
> > > From: gus@ldeo.columbia.edu
> > > Date: Sun, 24 Oct 2010 17:31:11 -0400
> > > To: users@open-mpi.org
> > > Subject: Re: [OMPI users] Open MPI program cannot complete
> > >
> > > Hi Jack
> > >
> > > It may depend on "do some things".
> > > Does it involve MPI communication?
> > >
> > > Also, why not put MPI_Finalize();return 0 outside the ifs?
> > >
> > > Gus Correa
> > >
> > > On Oct 24, 2010, at 2:23 PM, Jack Bryan wrote:
> > >
> > > > Hi
> > > >
> > > > I have a problem with Open MPI.
> > > >
> > > > My program has 5 processes.
> > > >
> > > > All of them can run MPI_Finalize() and return 0.
> > > >
> > > > But, the whole program cannot be completed.
> > > >
> > > > In the MPI cluster job queue, it is still in running status.
> > > >
> > > > If I use 1 process to run it, no problem.
> > > >
> > > > Why ?
> > > >
> > > > My program:
> > > >
> > > > int main (int argc, char **argv)
> > > > {
> > > >
> > > > MPI_Init(&argc, &argv);
> > > > MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
> > > > MPI_Comm_size(MPI_COMM_WORLD, &mySize);
> > > > MPI_Comm world;
> > > > world = MPI_COMM_WORLD;
> > > >
> > > > if (myRank == 0)
> > > > {
> > > > do some things.
> > > > }
> > > >
> > > > if (myRank != 0)
> > > > {
> > > > do some things.
> > > > MPI_Finalize();
> > > > return 0 ;
> > > > }
> > > > if (myRank == 0)
> > > > {
> > > > MPI_Finalize();
> > > > return 0;
> > > > }
> > > >
> > > > }
> > > >
> > > > Also, some output files contain wrong codes and are not readable.
> > > > In the 1-process case, the program prints correct results to these output files.
> > > >
> > > > Any help is appreciated.
> > > >
> > > > thanks
> > > >
> > > > Jack
> > > >
> > > > Oct. 24 2010
> > > >
> > > > _______________________________________________
> > > > users mailing list
> > > > users@open-mpi.org
> > > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > >
> > >
>
>
--
David Zhang
University of California, San Diego
--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.dontje@oracle.com