Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] ompi-clean on single executable
From: Reuti (reuti_at_[hidden])
Date: 2012-10-24 05:55:01


On 24.10.2012 at 11:33, Nicolas Deladerriere wrote:

> Reuti,
>
> Thanks for your comments,
>
> In our case, we are currently running different mpirun commands on
> clusters sharing the same frontend. Basically, we use a wrapper to run
> the mpirun command and to run an ompi-clean command to clean up the
> MPI job if required.
> Using ompi-clean like this kills all other MPI jobs running on the
> same frontend. I cannot use a queuing system

Why? Using it on a single machine was only one possible setup. Its purpose is to distribute jobs to slave hosts. If you already have one frontend as a login machine, it fits perfectly: the qmaster (in the case of SGE) can run there and the execd on the nodes.

-- Reuti
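
For the wrapper described above, one alternative to calling ompi-clean is to have the wrapper remember the PID of the mpirun it started and signal only that process. The mpirun man page documents that on SIGTERM or SIGINT mpirun tries to kill its entire job, so other jobs of the same user are left alone. The following is only a sketch under stated assumptions: the function names, the PID-file convention, and the `LAUNCHER` override are hypothetical and not part of Open MPI, and whether the session directory is left clean afterwards should be checked on your version.

```shell
#!/bin/sh
# Sketch: track one launcher process per job and stop only that one.
# LAUNCHER is a hypothetical override so the script can be tried without MPI.
LAUNCHER=${LAUNCHER:-mpirun}

# start_job PIDFILE ARGS...
# Start the launcher in the background and record its PID in PIDFILE.
start_job() {
    pidfile=$1
    shift
    $LAUNCHER "$@" &
    echo "$!" > "$pidfile"
}

# stop_job PIDFILE
# Send SIGTERM to the recorded launcher only, wait up to about 10 seconds
# for it to exit, then fall back to SIGKILL. Other launchers are untouched.
stop_job() {
    pidfile=$1
    pid=$(cat "$pidfile") || return 1
    # If the process is already gone, just drop the stale PID file.
    kill -TERM "$pid" 2>/dev/null || { rm -f "$pidfile"; return 0; }
    i=0
    while kill -0 "$pid" 2>/dev/null && [ "$i" -lt 10 ]; do
        sleep 1
        i=$((i + 1))
    done
    kill -0 "$pid" 2>/dev/null && kill -KILL "$pid" 2>/dev/null
    rm -f "$pidfile"
}
```

With these definitions, `start_job /tmp/job1.pid -np 2 myexec1` followed later by `stop_job /tmp/job1.pid` would stop only the first job (the PID-file path is an example). A PID file can go stale if the PID is reused after the job has ended, so a real wrapper would want an extra check before signalling.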

> as you have suggested, which
> is why I was wondering about an option or other solution associated with the
> ompi-clean command to avoid this blanket cleanup of all MPI jobs.
>
> Cheers
> Nicolas
>
> 2012/10/24, Reuti <reuti_at_[hidden]>:
>> Hi,
>>
>> On 24.10.2012 at 09:36, Nicolas Deladerriere wrote:
>>
>>> I am having an issue running ompi-clean, which cleans up (this is normal)
>>> the session associated with a user, which means it kills all running jobs
>>> associated with this session (this is also normal). But I would like to be
>>> able to clean up the session associated with a job (and not a user).
>>>
>>> Here is my point:
>>>
>>> I am running two executables:
>>>
>>> % mpirun -np 2 myexec1
>>> --> run with PID 2399 ...
>>> % mpirun -np 2 myexec2
>>> --> run with PID 2402 ...
>>>
>>> When I run orte-clean, I get this result:
>>> % orte-clean -v
>>> orte-clean: cleaning session dir tree openmpi-sessions-ndelader@myhost_0
>>> orte-clean: killing any lingering procs
>>> orte-clean: found potential rogue orterun process
>>> (pid=2399,user=ndelader), sending SIGKILL...
>>> orte-clean: found potential rogue orterun process
>>> (pid=2402,user=ndelader), sending SIGKILL...
>>>
>>> Which means that both jobs have been killed :-(
>>> Basically, I would like to run orte-clean using the executable name, PID,
>>> or whatever identifies which job I want to stop and clean. It seems I
>>> would need to create an Open MPI session per job. Does that make sense? And
>>> I would like to be able to run something like the following command and get
>>> the following result:
>>>
>>> % orte-clean -v myexec1
>>> orte-clean: cleaning session dir tree openmpi-sessions-ndelader@myhost_0
>>> orte-clean: killing any lingering procs
>>> orte-clean: found potential rogue orterun process
>>> (pid=2399,user=ndelader), sending SIGKILL...
>>>
>>>
>>> Does that make sense? Is there a way to perform this kind of selection in
>>> the cleaning process?
>>
>> How many jobs are you starting on how many nodes at one time? This
>> requirement could be a reason to start using a queuing system, where you can
>> remove jobs individually and also serialize your workflow. In fact, we also
>> use GridEngine locally on workstations for this purpose.
>>
>> -- Reuti
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
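
The selection by executable name requested earlier in the thread can be approximated outside orte-clean by filtering the process list for launcher processes whose command line names the executable. The sketch below assumes the launcher appears as `mpirun` or `orterun` in the `ps` output and that the executable name is one of its arguments. The function names `select_launchers` and `stop_by_name` are hypothetical helpers, not Open MPI commands. Any argument that happens to equal the name (a hostname, for example) would also match, so treat the result as a candidate list.

```shell
#!/bin/sh
# select_launchers NAME
# Read "PID COMMAND ARGS..." lines on stdin and print the PID of every
# mpirun/orterun line that has an argument whose basename equals NAME.
select_launchers() {
    awk -v name="$1" '
        # Return the last path component of a word.
        function base(path,   parts, n) {
            n = split(path, parts, "/")
            return parts[n]
        }
        {
            cmd = base($2)
            if (cmd != "mpirun" && cmd != "orterun") next
            for (i = 3; i <= NF; i++)
                if (base($i) == name) { print $1; next }
        }'
}

# stop_by_name NAME
# Send SIGTERM only to the current user's launchers selected above.
stop_by_name() {
    for pid in $(ps -u "$(id -un)" -o pid= -o args= | select_launchers "$1"); do
        kill -TERM "$pid"
    done
}
```

For the two jobs shown in the thread, `stop_by_name myexec1` would signal only the mpirun that was started with `myexec1` and leave the `myexec2` job running.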