
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Problem running MPI on cluster (mpi4py)
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-09-26 02:35:40


The mpi4py web site appears to be down right now, so I can't check, but don't you need to call MPI_Finalize somehow?

Maybe you need to explicitly close the MPI module (which then implicitly calls MPI_Finalize)? I'm afraid I don't know much about mpi4py, so I can't offer specific advice.
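Something like the following might work as a safety net -- this is an untested sketch, and the mpi4py names it uses (MPI.Is_initialized, MPI.Is_finalized, MPI.Finalize) are assumptions on my part, so check them against the mpi4py docs:

```python
# Untested sketch: explicitly finalize MPI before exit if the binding
# has not already done so. The try/except lets the file import even on
# a machine without mpi4py installed.
try:
    from mpi4py import MPI
    HAVE_MPI = True
except ImportError:
    HAVE_MPI = False  # running without mpi4py; everything becomes a no-op

def ensure_finalized():
    """Call MPI_Finalize exactly once, only if MPI was initialized."""
    if HAVE_MPI and MPI.Is_initialized() and not MPI.Is_finalized():
        MPI.Finalize()
```

You would call ensure_finalized() at the very end of your script (or register it with atexit).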

That being said, I also notice that all your outputs say "I am process 0 of 1". This might have been mentioned already, but this typically means you've got a mismatch between the Open MPI version that you compiled mpi4py with and the mpirun that you used to launch it. You may even have compiled mpi4py against MPICH, but used Open MPI's mpirun to launch it. That could also lead to side effects like saying that your program exited incorrectly.
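A quick way to see what your job will actually pick up is to print the launcher and library search paths from Python itself. This is just a diagnostic sketch (not mpi4py API); compare its output against the Open MPI installation you think you are using:

```python
import os
import shutil

def mpi_env_report():
    """Gather the launcher/library settings that must agree for an
    mpirun-launched job. Purely a diagnostic sketch."""
    return {
        "mpirun": shutil.which("mpirun"),                 # launcher found on PATH
        "PATH": os.environ.get("PATH", ""),               # where mpirun is searched
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH", ""),  # where libmpi is searched
    }

for key, value in mpi_env_report().items():
    print(key, "=", value)
```

If the mpirun entry and LD_LIBRARY_PATH point at different MPI installations, that is exactly the singleton symptom you are seeing.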

To be absolutely clear: you need to use the exact same version of Open MPI to both compile mpi4py and mpirun your python program.
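If you can get a version banner from each side (e.g., the output of "mpirun --version" and whatever version string your mpi4py build reports), even a crude comparison will flag the MPICH-vs-Open-MPI case. The banner substrings below are assumptions about what each implementation prints, so adjust them for your installation:

```python
def mpi_vendor(banner: str) -> str:
    """Crudely identify the MPI implementation from a version banner.
    Assumes Open MPI banners contain "Open MPI" and MPICH banners
    contain "MPICH"; anything else is reported as unknown."""
    stripped = banner.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    if "Open MPI" in first_line:
        return "Open MPI"
    if "MPICH" in first_line:
        return "MPICH"
    return "unknown"

def same_implementation(build_banner: str, launch_banner: str) -> bool:
    """True if the MPI you built against matches the mpirun you launch with."""
    return mpi_vendor(build_banner) == mpi_vendor(launch_banner)
```

If same_implementation() comes back False for your build and your launcher, rebuild mpi4py against the Open MPI installation whose mpirun you are using.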

On Sep 26, 2012, at 6:18 AM, Ralph Castain wrote:

> Well, not sure what I can advise. Check to ensure that your LD_LIBRARY_PATH is pointing to the same installation where your mpirun is located. For whatever reason, the processes think they are singletons - i.e., that they were not actually started by mpirun.
>
> You might also want to ask the mpi4py folks - we aren't very familiar with that package over here. It could be that you need to configure it for OpenMPI as opposed to mpich.
>
>
> On Tue, Sep 25, 2012 at 7:08 PM, Mariana Vargas Magana <mmarianav_at_[hidden]> wrote:
>
> Yes, I am sure. I took it from an mpi4py guide and already checked the examples; in fact, this is an example extracted from a guide! Moreover, if I run this same example with MPICH2 it works very nicely, even though for the other code I need Open MPI working =s
>
> Mariana
>
>
>
> On Sep 25, 2012, at 8:00 PM, Ralph Castain <rhc.openmpi_at_[hidden]> wrote:
>
>> I don't think that is true, but I suggest you check the mpi4py examples. I believe all import does is import function definitions - it doesn't execute anything.
>>
>> Sent from my iPad
>>
>> On Sep 25, 2012, at 2:41 PM, mariana Vargas <mmarianav_at_[hidden]> wrote:
>>
>>> MPI_Init() is actually called when you import the MPI module from the mpi4py package...
>>>
>>>
>>> On Sep 25, 2012, at 5:17 PM, Ralph Castain wrote:
>>>
>>>> You forgot to call MPI_Init at the beginning of your program.
>>>>
>>>> On Sep 25, 2012, at 2:08 PM, Mariana Vargas Magana <mmarianav_at_[hidden]> wrote:
>>>>
>>>>> Hi
>>>>> I think I'm not understanding what you said. Here is hello.py, followed by the mpirun command…
>>>>>
>>>>> Thanks!
>>>>>
>>>>> #!/usr/bin/env python
>>>>> """
>>>>> Parallel Hello World
>>>>> """
>>>>>
>>>>> from mpi4py import MPI
>>>>> import sys
>>>>>
>>>>> size = MPI.COMM_WORLD.Get_size()
>>>>> rank = MPI.COMM_WORLD.Get_rank()
>>>>> name = MPI.Get_processor_name()
>>>>>
>>>>> sys.stdout.write(
>>>>>     "Hello, World! I am process %d of %d on %s.\n"
>>>>>     % (rank, size, name))
>>>>>
>>>>> ~/bin/mpirun -np 70 python2.7 helloworld.py
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>> Hello, World! I am process 0 of 1 on ferrari.
>>>>>
>>>>> On Sep 25, 2012, at 4:46 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>>>>
>>>>>> The usual reason for this is that you aren't launching these processes correctly. How are you starting your job? Are you using mpirun?
>>>>>>
>>>>>>
>>>>>> On Sep 25, 2012, at 1:43 PM, mariana Vargas <mmarianav_at_[hidden]> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> In fact, I found the origin of this problem: all processes have rank 0. I tested it, and indeed even when I submit the classical hello.py I get the same result. How can I solve this? Do I have to reinstall everything again?
>>>>>>>
>>>>>>> Help please...
>>>>>>>
>>>>>>> Mariana
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sep 24, 2012, at 9:13 PM, Mariana Vargas Magana wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, you are right, that is what it says. But in fact the weird thing is that the error message does not appear every time… I submit to 20 nodes and only one gives this message. Is this normal…
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sep 24, 2012, at 8:00 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>>>>>>>
>>>>>>>>> Well, as it says, your processes called MPI_Init, but at least one of them exited without calling MPI_Finalize. That violates the MPI rules and we therefore terminate the remaining processes.
>>>>>>>>>
>>>>>>>>> Check your code and see how/why you are doing that - you probably have a code path whereby a process exits without calling finalize.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sep 24, 2012, at 4:37 PM, mariana Vargas <mmarianav_at_[hidden]> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi all
>>>>>>>>>>
>>>>>>>>>> I get this error when I run a parallelized Python code on a cluster. Could anyone give me an idea of what is happening? I'm new at this. Thanks...
>>>>>>>>>>
>>>>>>>>>> mpirun has exited due to process rank 2 with PID 10259 on
>>>>>>>>>> node f01 exiting improperly. There are two reasons this could occur:
>>>>>>>>>>
>>>>>>>>>> 1. this process did not call "init" before exiting, but others in
>>>>>>>>>> the job did. This can cause a job to hang indefinitely while it waits
>>>>>>>>>> for all processes to call "init". By rule, if one process calls "init",
>>>>>>>>>> then ALL processes must call "init" prior to termination.
>>>>>>>>>>
>>>>>>>>>> 2. this process called "init", but exited without calling "finalize".
>>>>>>>>>> By rule, all processes that call "init" MUST call "finalize" prior to
>>>>>>>>>> exiting or it will be considered an "abnormal termination"
>>>>>>>>>>
>>>>>>>>>> This may have caused other processes in the application to be
>>>>>>>>>> terminated by signals sent by mpirun (as reported here).
>>>>>>>>>>
>>>>>>>>>> Thanks!!
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Dr. Mariana Vargas Magana
>>>>>>>>>>> Astroparticule et Cosmologie - Bureau 409B
>>>>>>>>>>> PHD student- Université Denis Diderot-Paris 7
>>>>>>>>>>> 10, rue Alice Domon et Léonie Duquet
>>>>>>>>>>> 75205 Paris Cedex - France
>>>>>>>>>>> Tel. +33 (0)1 57 27 70 32
>>>>>>>>>>> Fax. +33 (0)1 57 27 60 71
>>>>>>>>>>> mariana_at_[hidden]
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> users mailing list
>>>>>>>>>> users_at_[hidden]
>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/