Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] dead lock in MPI_Finalize
From: Bernard Secher - SFME/LGLS (bernard.secher_at_[hidden])
Date: 2009-01-26 03:14:15


Hello George,

Thanks for your messages. Yes, I disconnect my different worlds before
calling MPI_Finalize().

Bernard

George Bosilca wrote:
> I was somewhat confused when I wrote my last email and mixed up the
> MPI versions (thanks to Dick Treumann for gently pointing me to the
> truth). Before MPI 2.1, the MPI Standard was unclear about how
> MPI_Finalize should behave in the context of spawned or joined worlds,
> which made disconnect+finalize the only safe and portable way to
> correctly finalize all connected processes. However, MPI 2.1
> clarified this point, and MPI_Finalize is now collective over all
> connected processes (for a definition of connected processes, please
> see MPI 2.1, section 10.5, page 318).
>
> However, if you really want to write a portable MPI application, I
> suggest using disconnect+finalize, at least until all available MPI
> libraries are 2.1 compliant.
>
> Open MPI 1.3 was supposed to be 2.1 compliant, so I guess I'll
> have to create a new bug report for this.
>
> Thanks,
> george.
>
> On Jan 23, 2009, at 10:02 , George Bosilca wrote:
>
>> I don't know what your program is doing, but I can guess what the
>> problem is. If you use MPI-2 dynamics to spawn or connect two
>> MPI_COMM_WORLDs, you have to disconnect them before calling
>> MPI_Finalize. The reason is that MPI_Finalize does the opposite of
>> MPI_Init, so it is MPI_COMM_WORLD based. Making sure your different
>> worlds are disconnected before calling MPI_Finalize should solve the
>> problem.
>>
>> george.
>>
>> On Jan 23, 2009, at 06:00 , Bernard Secher - SFME/LGLS wrote:
>>
>>> No, I didn't run this program with Open MPI 1.2.x, because I was told
>>> there were many changes between the 1.2.x versions and 1.3 concerning
>>> MPI_Publish_name and MPI_Lookup_name (new ompi-server, ...), and that
>>> it was better to use version 1.3.
>>>
>>> Yes, I am sure all processes reach MPI_Finalize(), because I write a
>>> message just before it (the END_OF macro in my program), and I am
>>> sure all processes are blocked in MPI_Finalize(), because I write a
>>> message just after it (the MESSAGE macro).
>>>
>>> Maybe not all MPI_Sends are matched by corresponding MPI_Recvs...
>>> That is a possibility.
>>>
>>> Thanks
>>> Bernard
>>>
>>>
>>>
>>> jody wrote:
>>>> Hi Bernard
>>>>
>>>> The structure looks OK as far as I can see.
>>>> Did it run OK on Open-MPI 1.2.X?
>>>> So are you sure all processes reach the MPI_Finalize command?
>>>> Usually MPI_Finalize only completes when all processes reach it.
>>>> I think you should also make sure that all MPI_Sends are matched by
>>>> corresponding MPI_Recvs.
>>>>
>>>> Jody
>>>>
>>>> On Fri, Jan 23, 2009 at 11:08 AM, Bernard Secher - SFME/LGLS
>>>> <bernard.secher_at_[hidden]> wrote:
>>>>
>>>>> Thanks Jody for your answer.
>>>>>
>>>>> I launch 2 instances of my program, each on 2 processes, on the
>>>>> same machine.
>>>>> I use MPI_Publish_name and MPI_Lookup_name to create a global
>>>>> communicator over the 4 processes.
>>>>> Then the 4 processes exchange data.
>>>>>
>>>>> The main program is a CORBA server. I am sending you this program.
>>>>>
>>>>> Bernard
>>>>>
>>>>> jody wrote:
>>>>>
>>>>> For instance:
>>>>> - how many processes on how many machines,
>>>>> - what kind of computation
>>>>> - perhaps minimal code which reproduces this failing
>>>>> - configuration settings, etc.
>>>>> See: http://www.open-mpi.org/community/help/
>>>>>
>>>>> Without any information except for "it doesn't work",
>>>>> nobody can give you any help whatsoever.
>>>>>
>>>>> Jody
>>>>>
>>>>> On Fri, Jan 23, 2009 at 9:33 AM, Bernard Secher - SFME/LGLS
>>>>> <bernard.secher_at_[hidden]> wrote:
>>>>>
>>>>>
>>>>> Hello Jeff,
>>>>>
>>>>> I don't understand what you mean by "A _detailed_ description of
>>>>> what is failing".
>>>>> The problem is a deadlock in the MPI_Finalize() function. All
>>>>> processes are blocked in MPI_Finalize().
>>>>>
>>>>> Bernard
>>>>>
>>>>> Jeff Squyres wrote:
>>>>>
>>>>>
>>>>> Per this note on the "getting help" page, we still need the
>>>>> following:
>>>>>
>>>>> "A _detailed_ description of what is failing. The more details that
>>>>> you provide, the better. E-mails saying "My application doesn't
>>>>> work!" will inevitably be answered with requests for more
>>>>> information about exactly what doesn't work; so please include as
>>>>> much information detailed in your initial e-mail as possible."
>>>>>
>>>>> Additionally:
>>>>>
>>>>> "The best way to get help is to provide a "recipe" for reproducing
>>>>> the problem."
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>> On Jan 22, 2009, at 8:53 AM, Bernard Secher - SFME/LGLS wrote:
>>>>>
>>>>>
>>>>>
>>>>> Hello Tim,
>>>>>
>>>>> I am sending you the information in the attached files.
>>>>>
>>>>> Bernard
>>>>>
>>>>> Tim Mattox wrote:
>>>>>
>>>>>
>>>>> Can you send all the information listed here:
>>>>>
>>>>> http://www.open-mpi.org/community/help/
>>>>>
>>>>> On Wed, Jan 21, 2009 at 8:58 AM, Bernard Secher - SFME/LGLS
>>>>> <bernard.secher_at_[hidden]> wrote:
>>>>>
>>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> I have a case where I get a deadlock in the MPI_Finalize() function
>>>>> with Open MPI v1.3.
>>>>>
>>>>> Can somebody help me, please?
>>>>>
>>>>> Bernard
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>        _\\|//_
>>>>>       (' 0 0 ')
>>>>> ____ooO  (_) Ooo______________________________________________________
>>>>>  Bernard Sécher  DEN/DM2S/SFME/LGLS    mailto : bsecher_at_[hidden]
>>>>>  CEA Saclay, Bât 454, Pièce 114        Phone  : 33 (0)1 69 08 73 78
>>>>>  91191 Gif-sur-Yvette Cedex, France    Fax    : 33 (0)1 69 08 10 87
>>>>> ------------Oooo---------------------------------------------------
>>>>>        oooO (   )
>>>>>        (   ) ) /
>>>>>         \ ( (_/
>>>>>          \_)
>>>>>
>>>>> This e-mail and any files transmitted with it are confidential and
>>>>> intended solely for the use of the individual to whom they are addressed.
>>>>> If you have received this e-mail in error please inform the sender
>>>>> immediately, without keeping any copy thereof.
>>>>>
>>>>>
>>>>> <config.log.tgz> <ifconfig.log.tgz> <ompi_info.log.tgz>

-- 
       _\\|//_
      (' 0 0 ')
____ooO  (_) Ooo______________________________________________________
 Bernard Sécher  DEN/DM2S/SFME/LGLS    mailto : bsecher_at_[hidden]
 CEA Saclay, Bât 454, Pièce 114        Phone  : 33 (0)1 69 08 73 78
 91191 Gif-sur-Yvette Cedex, France    Fax    : 33 (0)1 69 08 10 87
------------Oooo---------------------------------------------------
       oooO (   )
       (   ) ) /
        \ ( (_/
         \_)
This e-mail and any files transmitted with it are confidential and
intended solely for the use of the individual to whom they are addressed.
If you have received this e-mail in error please inform the sender
immediately, without keeping any copy thereof.