Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] How to cease the process triggered by OPENMPI
From: Brock Palen (brockp_at_[hidden])
Date: 2008-07-28 11:23:48


I don't see this command in my 1.2.6 install. There also isn't
a man page for it.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp_at_[hidden]
(734)936-1985

On Jul 28, 2008, at 11:15 AM, Rolf Vandevaart wrote:
>
> One other option, which should kill off processes and clean up, is the
> orte-clean command. In your case, you could do the following:
>
> mpirun -hostfile ~/hostfile --pernode orte-clean
>
> There is a man page for it also.
>
> Rolf
>
> Brock Palen wrote:
>> You would be much better off not using nohup, and then just killing
>> the mpirun.
>> What I mean is a batch system
>> (http://www.clusterresources.com/pages/products/torque-resource-manager.php).
>> Most batch systems have a launching system that lets you kill all the
>> remote processes when you kill the job. Look at how MPI works. When you
>> start MPI the way you are starting it (without a batch system),
>> you are using either ssh or rsh to start the remote processes.
>> Once these are started, the user has no control over the remote
>> processes. Try killing your mpirun, not your orted or pw.x. You
>> will be much happier with a batch system. Or make a script that
>> runs ssh to each host in the hostfile and kills pw.x on all of them.
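
The two manual options mentioned above (kill mpirun itself, or ssh to every host and kill pw.x) could look roughly like the sketch below. It rests on assumptions that are not stated in the thread: passwordless ssh to every node, pkill installed on the nodes, and mpirun's PID saved to a file at launch. The names job.pid, stop_job, list_hosts, kill_on_hosts and REMOTE_SH are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Assumed launch, with mpirun's PID saved (job.pid is hypothetical):
#   nohup mpirun -hostfile ~/hostfile -np 64 pw.x < input > output &
#   echo $! > job.pid

# Option 1: send SIGTERM to mpirun itself (not to orted or pw.x).
# $1 is the file holding mpirun's PID.
stop_job() {
    pid=$(cat "$1") || return 1
    # kill -0 sends no signal; it only checks that the process exists
    if kill -0 "$pid" 2>/dev/null; then
        kill -TERM "$pid"
    else
        echo "no running process with PID $pid" >&2
        return 1
    fi
}

# Print the unique host names in a hostfile: strip comments, skip blank
# lines, keep only the first field (drops extras such as "slots=4").
list_hosts() {
    sed -e 's/#.*//' "$1" | awk 'NF { print $1 }' | sort -u
}

# Option 2: run pkill on every host in the hostfile.
# $1 is the hostfile, $2 the exact process name (for example pw.x).
# REMOTE_SH is a hypothetical override; it defaults to "ssh -n".
kill_on_hosts() {
    for host in $(list_hosts "$1"); do
        echo "killing $2 on $host" >&2
        # -n stops ssh from reading stdin; pkill -x matches the whole name
        ${REMOTE_SH:-ssh -n} "$host" pkill -x "$2"
    done
}
```

A possible sequence would be `stop_job job.pid` first and, only if stray processes remain, `kill_on_hosts ~/hostfile pw.x`. Setting `REMOTE_SH=echo` beforehand prints the remote commands instead of running them, which is a cheap way to check the host list.
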
>> Brock Palen
>> www.umich.edu/~brockp
>> Center for Advanced Computing
>> brockp_at_[hidden] <mailto:brockp_at_[hidden]>
>> (734)936-1985
>> On Jul 27, 2008, at 2:04 PM, vega lew wrote:
>>> Dear Brock Palen,
>>>
>>> Thank you for your response.
>>>
>>> My Linux is Red Hat Enterprise Linux 4. My compilers are version 10.1.015
>>> of Intel Fortran and Intel C.
>>>
>>> You said 'when the job is killed all the children are also'
>>>
>>> But I started my Open MPI job using the nohup command to put the
>>> job in the background, like this:
>>> " nohup mpirun -hostfile ~/hostfile -np 64 pw.x < input > output
>>> & ".
>>>
>>> When I killed one of the processes named pw.x, all the others
>>> didn't stop.
>>> When I killed the process named orted, the pw.x process on the
>>> same node stopped immediately,
>>> but the jobs on the other nodes were still running.
>>>
>>> Do you think there is something wrong with my cluster or openmpi
>>> or the software named pw.x?
>>>
>>> Is there a command for Open MPI to force all the processes to stop
>>> on the cluster, or on a list of nodes?
>>> Vega Lew (weijia liu)
>>> PH.D Candidate in Chemical Engineering
>>> State Key Laboratory of Materials-oriented Chemical Engineering
>>> College of Chemistry and Chemical Engineering
>>> Nanjing University of Technology, 210009, Nanjing, Jiangsu, China
>>>
>>> ------------------------------------------------------------------------
>>> From: brockp_at_[hidden] <mailto:brockp_at_[hidden]>
>>> Date: Sat, 26 Jul 2008 12:52:08 -0400
>>> To: users_at_[hidden] <mailto:users_at_[hidden]>
>>> Subject: Re: [OMPI users] How to cease the process triggered by
>>> OPENMPI
>>>
>>> Does the cluster you're using have a batch system, like SLURM, PBS,
>>> or another?
>>>
>>> If so, many have native ways to launch jobs that OMPI can use, so
>>> that when the job is killed all the children are also.
>>>
>>> Brock Palen
>>> www.umich.edu/~brockp
>>> Center for Advanced Computing
>>> brockp_at_[hidden] <mailto:brockp_at_[hidden]>
>>> (734)936-1985
>>>
>>>
>>>
>>> On Jul 26, 2008, at 12:25 PM, vega lew wrote:
>>>
>>> Dear all,
>>>
>>> I have enjoyed using Open MPI for a couple of days. With its help
>>> I can run ESPRESSO efficiently.
>>>
>>> I started the MPI job with an Open MPI command like this,
>>>
>>> " nohup mpirun -hostfile ~/hostfile -np 64 pw.x < input >
>>> output &".
>>>
>>> When I want to stop the job before it has finished, I find it is not
>>> easy to stop all the processes manually. When I killed the process
>>> on one node of the cluster, the processes on the other nodes were
>>> still running. So I must ssh to every node, find the
>>> process ID, and kill the process. If there are 100 processors or
>>> more for one MPI job, the situation is even worse.
>>>
>>> Is there a command for Open MPI to force all the processes to
>>> stop on the cluster, or on a list of nodes?
>>> vega
>>>
>>> Vega Lew (weijia liu)
>>> PH.D Candidate in Chemical Engineering
>>> State Key Laboratory of Materials-oriented Chemical Engineering
>>> College of Chemistry and Chemical Engineering
>>> Nanjing University of Technology, 210009, Nanjing, Jiangsu,
>>> China
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden] <mailto:users_at_[hidden]>
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>>
>
>
> --
>
> =========================
> rolf.vandevaart_at_[hidden]
> 781-442-3043
> =========================