Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi-1.7.4a1r29646 with -hostfile option under Torque manager
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-12-10 19:34:06


No, that is actually correct. We map processes onto a socket until it is full, then move to the next one. What you want is --map-by socket:span
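
For reference, here is a sketch of the invocation from the run quoted below, with only the span modifier added (myprog is the test program from that run):

  mpirun -np 8 -report-bindings -cpus-per-proc 4 --map-by socket:span myprog

With socket:span, consecutive ranks should be spread round-robin across the sockets instead of filling one socket first, which matches the layout requested below (rank 0 on socket 0, rank 1 on socket 1, and so on).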

On Dec 10, 2013, at 3:42 PM, tmishima_at_[hidden] wrote:

>
>
> Hi Ralph,
>
> I had time to try your patch yesterday using openmpi-1.7.4a1r29646.
>
> It stopped the error, but unfortunately "mapping by socket" itself didn't
> work well, as shown below:
>
> [mishima_at_manage demos]$ qsub -I -l nodes=1:ppn=32
> qsub: waiting for job 8260.manage.cluster to start
> qsub: job 8260.manage.cluster ready
>
> [mishima_at_node04 ~]$ cd ~/Desktop/openmpi-1.7/demos/
> [mishima_at_node04 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 4 -map-by socket myprog
> [node04.cluster:27489] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]:
> [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
> [node04.cluster:27489] MCW rank 3 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]:
> [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
> [node04.cluster:27489] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]], socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]:
> [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
> [node04.cluster:27489] MCW rank 5 bound to socket 2[core 20[hwt 0]], socket 2[core 21[hwt 0]], socket 2[core 22[hwt 0]], socket 2[core 23[hwt 0]]:
> [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
> [node04.cluster:27489] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]], socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]:
> [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
> [node04.cluster:27489] MCW rank 7 bound to socket 3[core 28[hwt 0]], socket 3[core 29[hwt 0]], socket 3[core 30[hwt 0]], socket 3[core 31[hwt 0]]:
> [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
> [node04.cluster:27489] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]:
> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
> [node04.cluster:27489] MCW rank 1 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]:
> [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
> Hello world from process 2 of 8
> Hello world from process 1 of 8
> Hello world from process 3 of 8
> Hello world from process 0 of 8
> Hello world from process 6 of 8
> Hello world from process 5 of 8
> Hello world from process 4 of 8
> Hello world from process 7 of 8
>
> I think it should look like this:
>
> rank 00
> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
> rank 01
> [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
> rank 02
> [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
> ...
>
> Regards,
> Tetsuya Mishima
>
>> I fixed this under the trunk (it was an issue regardless of RM) and have
>> scheduled it for 1.7.4.
>>
>> Thanks!
>> Ralph
>>
>> On Nov 25, 2013, at 4:22 PM, tmishima_at_[hidden] wrote:
>>
>>>
>>>
>>> Hi Ralph,
>>>
>>> Thank you very much for your quick response.
>>>
>>> I'm afraid to say that I found one more issue...
>>>
>>> It's not so serious, so please check it whenever you have time.
>>>
>>> The problem is -cpus-per-proc with the -map-by option under the Torque
>>> manager. It doesn't work, as shown below. I guess you would see the same
>>> behaviour under the Slurm manager.
>>>
>>> Of course, if I remove the -map-by option, it works quite well.
>>>
>>> [mishima_at_manage testbed2]$ qsub -I -l nodes=1:ppn=32
>>> qsub: waiting for job 8116.manage.cluster to start
>>> qsub: job 8116.manage.cluster ready
>>>
>>> [mishima_at_node03 ~]$ cd ~/Ducom/testbed2
>>> [mishima_at_node03 testbed2]$ mpirun -np 8 -report-bindings -cpus-per-proc 4 -map-by socket mPre
>>>
>>> --------------------------------------------------------------------------
>>> A request was made to bind to that would result in binding more
>>> processes than cpus on a resource:
>>>
>>> Bind to: CORE
>>> Node: node03
>>> #processes: 2
>>> #cpus: 1
>>>
>>> You can override this protection by adding the "overload-allowed"
>>> option to your binding directive.
>>>
>>> --------------------------------------------------------------------------
>>>
>>>
>>> [mishima_at_node03 testbed2]$ mpirun -np 8 -report-bindings -cpus-per-proc 4 mPre
>>> [node03.cluster:18128] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]:
>>> [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>> [node03.cluster:18128] MCW rank 3 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]:
>>> [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
>>> [node03.cluster:18128] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]], socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]:
>>> [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>> [node03.cluster:18128] MCW rank 5 bound to socket 2[core 20[hwt 0]], socket 2[core 21[hwt 0]], socket 2[core 22[hwt 0]], socket 2[core 23[hwt 0]]:
>>> [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
>>> [node03.cluster:18128] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]], socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]:
>>> [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
>>> [node03.cluster:18128] MCW rank 7 bound to socket 3[core 28[hwt 0]], socket 3[core 29[hwt 0]], socket 3[core 30[hwt 0]], socket 3[core 31[hwt 0]]:
>>> [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
>>> [node03.cluster:18128] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]:
>>> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>> [node03.cluster:18128] MCW rank 1 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]:
>>> [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>
>>> Regards,
>>> Tetsuya Mishima
>>>
>>>> Fixed and scheduled to move to 1.7.4. Thanks again!
>>>>
>>>>
>>>> On Nov 17, 2013, at 6:11 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>>>>
>>>> Thanks! That's precisely where I was going to look when I had time :-)
>>>>
>>>> I'll update tomorrow.
>>>> Ralph
>>>>
>>>>
>>>>
>>>>
>>>> On Sun, Nov 17, 2013 at 7:01 PM, <tmishima_at_[hidden]> wrote:
>>>>
>>>>
>>>> Hi Ralph,
>>>>
>>>> This is a continuation of "Segmentation fault in oob_tcp.c of
>>>> openmpi-1.7.4a1r29646".
>>>>
>>>> I found the cause.
>>>>
>>>> First, I noticed that your hostfile works while mine does not.
>>>>
>>>> Your host file:
>>>> cat hosts
>>>> bend001 slots=12
>>>>
>>>> My host file:
>>>> cat hosts
>>>> node08
>>>> node08
>>>> ...(total 8 lines)
>>>>
>>>> I modified my script file to add "slots=1" to each line of my hostfile
>>>> just before launching mpirun. Then it worked.
>>>>
>>>> My host file(modified):
>>>> cat hosts
>>>> node08 slots=1
>>>> node08 slots=1
>>>> ...(total 8 lines)
>>>>
>>>> Secondly, I confirmed that there's a slight difference between
>>>> orte/util/hostfile/hostfile.c of 1.7.3 and that of 1.7.4a1r29646.
>>>>
>>>> $ diff hostfile.c.org ../../../../openmpi-1.7.3/orte/util/hostfile/hostfile.c
>>>> 394,401c394,399
>>>> < if (got_count) {
>>>> < node->slots_given = true;
>>>> < } else if (got_max) {
>>>> < node->slots = node->slots_max;
>>>> < node->slots_given = true;
>>>> < } else {
>>>> < /* should be set by obj_new, but just to be clear */
>>>> < node->slots_given = false;
>>>> ---
>>>>> if (!got_count) {
>>>>> if (got_max) {
>>>>> node->slots = node->slots_max;
>>>>> } else {
>>>>> ++node->slots;
>>>>> }
>>>> ....
>>>>
>>>> Finally, I added line 402 below just as a tentative trial.
>>>> Then it worked.
>>>>
>>>> cat -n orte/util/hostfile/hostfile.c:
>>>> ...
>>>> 394 if (got_count) {
>>>> 395 node->slots_given = true;
>>>> 396 } else if (got_max) {
>>>> 397 node->slots = node->slots_max;
>>>> 398 node->slots_given = true;
>>>> 399 } else {
>>>> 400 /* should be set by obj_new, but just to be clear */
>>>> 401 node->slots_given = false;
>>>> 402 ++node->slots; /* added by tmishima */
>>>> 403 }
>>>> ...
>>>>
>>>> Please fix the problem properly, because my change is just based on a
>>>> rough guess. It seems to be related to the treatment of a hostfile where
>>>> no slots information is given.
>>>>
>>>> Regards,
>>>> Tetsuya Mishima
>>>>