
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] OpenMPI 1.3 and SGE 6.2u1
From: Rolf Vandevaart (Rolf.Vandevaart_at_[hidden])
Date: 2009-03-19 13:07:10


Your understanding is exactly right. This issue came up earlier today.
The suggestion was to add one of the following to your mpirun command.

--mca orte_leave_session_attached 1
-leave-session-attached
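
For example, in an SGE job script the option just goes on the existing
mpirun line. A minimal sketch (the slot count, queue directives, and
program name are only placeholders):

#!/bin/sh
#$ -pe orte 8
#$ -cwd
# Keep the orteds attached to their sge_shepherd so SGE can account
# for (and control) the processes they start on each node.
mpirun -leave-session-attached -np $NSLOTS ./my_mpi_program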

Here is the thread from earlier.

http://www.open-mpi.org/community/lists/users/2009/03/8511.php

Rolf


On 03/19/09 12:19, Malone, Scott wrote:
> Since I'm new to Open MPI I wanted to make sure that I understand this. When the job starts, the orteds are daemonized and because of this they are not bound to the sge_shepherd on each node, which results in the loss of accounting for those processes. I guess that when I start mpirun with debugging, the orted is no longer daemonized and stays attached to the sge_shepherd? If this is true, is there any way to start the orted non-daemonized, without turning on debugging, until 1.3.2 is available?
>
>
> Thanks!
>
> Scott Malone
> Manager, High Performance Computing Facility
> Information Sciences - Research Informatics
> St. Jude Children's Research Hospital
> 332 North Lauderdale
> Memphis, TN 38105
> 901.495.4947
> scott.malone_at_[hidden]
>
>
>
>> -----Original Message-----
>> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On
>> Behalf Of Reuti
>> Sent: Thursday, March 19, 2009 10:32 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] OpenMPI 1.3 and SGE 6.2u1
>>
>> Hi,
>>
>> Am 19.03.2009 um 16:07 schrieb Malone, Scott:
>>
>>> I am having two problems with the integration of OpenMPI 1.3 and SGE
>>> 6.2u1, both of which are new to us. The troubles are getting jobs to
>>> suspend/resume and to collect cpu time correctly.
>>>
>>>
>>>
>>> For suspend/resume I have added the following to my mpirun command:
>>>
>>>
>>>
>>> --mca orte_forward_job_control 1 --mca plm_rsh_daemonize_qrsh 1
>>>
>> Why? In 1.3 the orted is already daemonizing because of a bug, and I
>> only found it necessary to daemonize the orted for the -notify
>> feature.
>>
>>> and adjusted the suspend_method for the queue that it's running in,
>>> but I have not gotten it to place any process into the T state.
>>> Although this is not a huge problem, I hope to have this working in
>>> the future.
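>>>
>>> (Roughly, the queue change in question looks like the following; the
>>> queue name is only an example, and the signal values are the ones
>>> commonly used so that mpirun can catch and forward the suspend:)
>>>
>>> qconf -mq all.q
>>> ...
>>> suspend_method            SIGTSTP
>>> resume_method             SIGCONT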
>>>
>>>
>>>
>>> My main problem is getting the cpu time correct. On a multi-cpu job
>>> only the master node shows the correct cpu time for its process; the
>>> others are very short and I am not sure what they are measuring (I
>>> believe startup time). Here's an example:
>>>
>> When the orteds daemonize, they are no longer bound to the
>> sge_shepherd. As a result, no one is tracking their accounting on the
>> nodes. AFAIK this will be fixed in 1.3.2, so that the daemons stay
>> bound to a running sge_shepherd.
>>
>> If you need the -notify feature and correct accounting, you will need
>> to wait until the qrsh_starter in SGE is fixed not to exit when it
>> receives a USR1/USR2.
>>
>> -- Reuti
>>
>>>
>>> cpu 0.360
>>> cpu 0.480
>>> cpu 0.470
>>> cpu 0.490
>>> cpu 0.530
>>> cpu 0.470
>>> cpu 0.680
>>> cpu 464.305
>>>
>>>
>>>
>>> From watching the runs, that time is close to the wall clock time
>>> and matches what I see for that single process. I have gotten it to
>>> give what I believe are correct values, but I have to add the
>>> --debug-daemons option to our mpirun command. With that I get the
>>> following:
>>>
>>>
>>>
>>> cpu 73.146
>>> cpu 72.982
>>> cpu 73.381
>>> cpu 73.142
>>> cpu 73.029
>>> cpu 73.183
>>> cpu 73.117
>>> cpu 73.265
>>> cpu 73.236
>>>
>>>
>>>
>>> I have noticed that when I get the cpu time correctly, qrsh
>>> processes start up (my understanding is that these are what start the
>>> processes on the remote machines) and they stay running until the job
>>> is finished. When I don't get the correct cpu time, I see the qrsh
>>> processes start on the master node but die off once they have started
>>> the processes on the remote nodes. The PE configuration looks like
>>> the following:
>>>
>>>
>>>
>>>
>>>
>>> pe_name            orte
>>> slots              560
>>> user_lists         NONE
>>> xuser_lists        NONE
>>> start_proc_args    /bin/true
>>> stop_proc_args     /bin/true
>>> allocation_rule    $round_robin
>>> control_slaves     TRUE
>>> job_is_first_task  FALSE
>>> urgency_slots      min
>>> accounting_summary FALSE
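>>>
>>> Jobs are then submitted through this PE in the usual way, e.g.
>>> (script name and slot count are just examples):
>>>
>>> qsub -pe orte 16 -cwd myjob.sh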
>>>
>>>
>>>
>>> Please let me know if I can provide any more information to help
>>> figure this out.
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Scott Malone
>>> Manager, High Performance Computing Facility
>>> Information Sciences - Research Informatics
>>> St. Jude Children's Research Hospital
>>> 332 North Lauderdale
>>> Memphis, TN 38105
>>> 901.495.4947
>>> scott.malone_at_[hidden]

-- 
=========================
rolf.vandevaart_at_[hidden]
781-442-3043
=========================