
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] another corner case hangup in openmpi-1.7.5rc3
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-03-24 11:16:31


The "updated" field in the orte_job_t structure is only used to help reduce the size of the launch message sent to all the daemons. Basically, we only include info on jobs that have changed - thus, it only gets used when the app calls comm_spawn. After every launch, we automatically reset it to false, and we are required to set it back to true if the number of daemons changes.

Since we won't have added any daemons, we don't really need to update the field, but we probably should do as you suggest just to ensure the value is right. Thanks!
Ralph

On Mar 18, 2014, at 4:31 PM, tmishima_at_[hidden] wrote:

>
>
> I confirmed your fix worked well for me. But I guess at least
> we should add the line "daemons->updated = false;" in the last
> if-clause, although I'm not sure how the variable is used.
> Is it okay, Ralph?
>
> Tetsuya
>
>> Understood, and your logic is correct. It's just that I'd rather each
>> launcher decide to declare the daemons as reported rather than doing it in
>> the common code, just in case someone writes a launcher where they choose
>> to respond differently to the case where no new daemons need to be launched.
>>
>>
>> On Mar 17, 2014, at 6:43 PM, tmishima_at_[hidden] wrote:
>>
>>>
>>>
>>> I do not understand your fix yet, but it would be better, I guess.
>>>
>>> I'll check it later, but for now please let me explain what I thought:
>>>
>>> If some nodes are allocated, it doesn't go through this part because
>>> opal_list_get_size(&nodes) > 0 at this location.
>>>
>>> 1590     if (0 == opal_list_get_size(&nodes)) {
>>> 1591         OPAL_OUTPUT_VERBOSE((5, orte_plm_base_framework.framework_output,
>>> 1592                              "%s plm:base:setup_vm only HNP in allocation",
>>> 1593                              ORTE_NAME_PRINT(ORTE_PROC_MY_NAME)));
>>> 1594         /* cleanup */
>>> 1595         OBJ_DESTRUCT(&nodes);
>>> 1596         /* mark that the daemons have reported so we can proceed */
>>> 1597         daemons->state = ORTE_JOB_STATE_DAEMONS_REPORTED;
>>> 1598         daemons->updated = false;
>>> 1599         return ORTE_SUCCESS;
>>> 1600     }
>>>
>>> After filtering, opal_list_get_size(&nodes) becomes zero at this location.
>>> That's why I think I should add the two lines 1597 and 1598 to the
>>> if-clause below.
>>>
>>> 1660     if (0 == opal_list_get_size(&nodes)) {
>>> 1661         OPAL_OUTPUT_VERBOSE((5, orte_plm_base_framework.framework_output,
>>> 1662                              "%s plm:base:setup_vm only HNP left",
>>> 1663                              ORTE_NAME_PRINT(ORTE_PROC_MY_NAME)));
>>> 1664         OBJ_DESTRUCT(&nodes);
>>> 1665         return ORTE_SUCCESS;
>>>
>>> Tetsuya
>>>
>>>> Hmm...no, I don't think that's the correct patch. We want that function
>>>> to remain "clean", as its job is simply to construct the list of nodes
>>>> for the VM. It's the responsibility of the launcher to decide what to do
>>>> with it.
>>>>
>>>> Please see https://svn.open-mpi.org/trac/ompi/ticket/4408 for a fix
>>>>
>>>> Ralph
>>>>
>>>> On Mar 17, 2014, at 5:40 PM, tmishima_at_[hidden] wrote:
>>>>
>>>>>
>>>>> Hi Ralph, I found another corner case hangup in openmpi-1.7.5rc3.
>>>>>
>>>>> Condition:
>>>>> 1. allocate some nodes using RM such as TORQUE.
>>>>> 2. request the head node only in executing the job with
>>>>> -host or -hostfile option.
>>>>>
>>>>> Example:
>>>>> 1. allocate node05,node06 using TORQUE.
>>>>> 2. request node05 only with -host option
>>>>>
>>>>> [mishima_at_manage ~]$ qsub -I -l nodes=node05+node06
>>>>> qsub: waiting for job 8661.manage.cluster to start
>>>>> qsub: job 8661.manage.cluster ready
>>>>>
>>>>> [mishima_at_node05 ~]$ cat $PBS_NODEFILE
>>>>> node05
>>>>> node06
>>>>> [mishima_at_node05 ~]$ mpirun -np 1 -host node05 ~/mis/openmpi/demos/myprog
>>>>> << hang here >>
>>>>>
>>>>> And, my fix for plm_base_launch_support.c is as follows:
>>>>> --- plm_base_launch_support.c      2014-03-12 05:51:45.000000000 +0900
>>>>> +++ plm_base_launch_support.try.c  2014-03-18 08:38:03.000000000 +0900
>>>>> @@ -1662,7 +1662,11 @@
>>>>>          OPAL_OUTPUT_VERBOSE((5, orte_plm_base_framework.framework_output,
>>>>>                               "%s plm:base:setup_vm only HNP left",
>>>>>                               ORTE_NAME_PRINT(ORTE_PROC_MY_NAME)));
>>>>> +        /* cleanup */
>>>>>          OBJ_DESTRUCT(&nodes);
>>>>> +        /* mark that the daemons have reported so we can proceed */
>>>>> +        daemons->state = ORTE_JOB_STATE_DAEMONS_REPORTED;
>>>>> +        daemons->updated = false;
>>>>>          return ORTE_SUCCESS;
>>>>>      }
>>>>>
>>>>> Tetsuya
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users