Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi-1.7.4a1r29646 with -hostfile option under Torque manager
From: tmishima_at_[hidden]
Date: 2013-12-10 21:05:58


Hi Ralph,

I tried again with -cpus-per-proc 2 as shown below.
Here, I found that "-map-by socket:span" worked well.

[mishima_at_node03 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 2
-map-by socket:span myprog
[node03.cluster:10879] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]]: [./././././././.][B/B/./././././.][./././././././.][./././././././.]
[node03.cluster:10879] MCW rank 3 bound to socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][././B/B/./././.][./././././././.][./././././././.]
[node03.cluster:10879] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]]: [./././././././.][./././././././.][B/B/./././././.][./././././././.]
[node03.cluster:10879] MCW rank 5 bound to socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]: [./././././././.][./././././././.][././B/B/./././.][./././././././.]
[node03.cluster:10879] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/B/./././././.]
[node03.cluster:10879] MCW rank 7 bound to socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]: [./././././././.][./././././././.][./././././././.][././B/B/./././.]
[node03.cluster:10879] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././././.][./././././././.][./././././././.][./././././././.]
[node03.cluster:10879] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./././.][./././././././.][./././././././.][./././././././.]
Hello world from process 1 of 8
Hello world from process 0 of 8
Hello world from process 4 of 8
Hello world from process 2 of 8
Hello world from process 7 of 8
Hello world from process 6 of 8
Hello world from process 5 of 8
Hello world from process 3 of 8
[mishima_at_node03 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 2
-map-by socket myprog
[node03.cluster:10921] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B/./.][./././././././.][./././././././.][./././././././.]
[node03.cluster:10921] MCW rank 3 bound to socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././././B/B][./././././././.][./././././././.][./././././././.]
[node03.cluster:10921] MCW rank 4 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]]: [./././././././.][B/B/./././././.][./././././././.][./././././././.]
[node03.cluster:10921] MCW rank 5 bound to socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][././B/B/./././.][./././././././.][./././././././.]
[node03.cluster:10921] MCW rank 6 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]]: [./././././././.][././././B/B/./.][./././././././.][./././././././.]
[node03.cluster:10921] MCW rank 7 bound to socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][././././././B/B][./././././././.][./././././././.]
[node03.cluster:10921] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././././.][./././././././.][./././././././.][./././././././.]
[node03.cluster:10921] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./././.][./././././././.][./././././././.][./././././././.]
Hello world from process 5 of 8
Hello world from process 1 of 8
Hello world from process 6 of 8
Hello world from process 4 of 8
Hello world from process 2 of 8
Hello world from process 0 of 8
Hello world from process 7 of 8
Hello world from process 3 of 8

"-np 8" and "-cpus-per-proc 4" just filled all sockets.
In this case, I guess "-map-by socket:span" and "-map-by socket" has same
meaning.
Therefore, there's no problem about that. Sorry for distubing.
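(Just to lay out the arithmetic: with -np 8 and -cpus-per-proc 4, 8 x 4 = 32
cores are consumed, which is exactly the 4 sockets x 8 cores of node03, so
"socket" and "socket:span" necessarily give the same layout. With
-cpus-per-proc 2 only 8 x 2 = 16 of the 32 cores are used, which is why the
two policies can differ, as the runs above show.)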

By the way, through this test, I found another problem.
Without the Torque manager, just using rsh, it causes the same error as shown
below:

[mishima_at_manage openmpi-1.7]$ rsh node03
Last login: Wed Dec 11 09:42:02 from manage
[mishima_at_node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
[mishima_at_node03 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 4
-map-by socket myprog
--------------------------------------------------------------------------
A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to: CORE
   Node: node03
   #processes: 2
   #cpus: 1

You can override this protection by adding the "overload-allowed"
option to your binding directive.
--------------------------------------------------------------------------
[mishima_at_node03 demos]$
[mishima_at_node03 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 4
myprog
[node03.cluster:11036] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
[node03.cluster:11036] MCW rank 3 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
[node03.cluster:11036] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]], socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]: [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
[node03.cluster:11036] MCW rank 5 bound to socket 2[core 20[hwt 0]], socket 2[core 21[hwt 0]], socket 2[core 22[hwt 0]], socket 2[core 23[hwt 0]]: [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
[node03.cluster:11036] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]], socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
[node03.cluster:11036] MCW rank 7 bound to socket 3[core 28[hwt 0]], socket 3[core 29[hwt 0]], socket 3[core 30[hwt 0]], socket 3[core 31[hwt 0]]: [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
[node03.cluster:11036] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
[node03.cluster:11036] MCW rank 1 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
Hello world from process 4 of 8
Hello world from process 2 of 8
Hello world from process 6 of 8
Hello world from process 5 of 8
Hello world from process 3 of 8
Hello world from process 7 of 8
Hello world from process 0 of 8
Hello world from process 1 of 8
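
As an aside, regarding the "overload-allowed" hint in the error message above:
I believe the override is given as a modifier to the binding directive, i.e.
something roughly like

[mishima_at_node03 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 4 -map-by socket -bind-to core:overload-allowed myprog

but I have not actually tried it, so please treat that command line as a guess
rather than a confirmed workaround.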

Regards,
Tetsuya Mishima

> Hmmm...that's strange. I only have 2 sockets on my system, but let me poke around a bit and see what might be happening.
>
> On Dec 10, 2013, at 4:47 PM, tmishima_at_[hidden] wrote:
>
> >
> >
> > Hi Ralph,
> >
> > Thanks. I didn't know the meaning of "socket:span".
> >
> > But it still causes the problem, which seems socket:span doesn't work.
> >
> > [mishima_at_manage demos]$ qsub -I -l nodes=node03:ppn=32
> > qsub: waiting for job 8265.manage.cluster to start
> > qsub: job 8265.manage.cluster ready
> >
> > [mishima_at_node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
> > [mishima_at_node03 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 4
> > -map-by socket:span myprog
> > [node03.cluster:10262] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
> > [node03.cluster:10262] MCW rank 3 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
> > [node03.cluster:10262] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]], socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]: [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
> > [node03.cluster:10262] MCW rank 5 bound to socket 2[core 20[hwt 0]], socket 2[core 21[hwt 0]], socket 2[core 22[hwt 0]], socket 2[core 23[hwt 0]]: [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
> > [node03.cluster:10262] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]], socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
> > [node03.cluster:10262] MCW rank 7 bound to socket 3[core 28[hwt 0]], socket 3[core 29[hwt 0]], socket 3[core 30[hwt 0]], socket 3[core 31[hwt 0]]: [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
> > [node03.cluster:10262] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
> > [node03.cluster:10262] MCW rank 1 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
> > Hello world from process 0 of 8
> > Hello world from process 3 of 8
> > Hello world from process 1 of 8
> > Hello world from process 4 of 8
> > Hello world from process 6 of 8
> > Hello world from process 5 of 8
> > Hello world from process 2 of 8
> > Hello world from process 7 of 8
> >
> > Regards,
> > Tetsuya Mishima
> >
> >> No, that is actually correct. We map a socket until full, then move to the next. What you want is --map-by socket:span
> >>
> >> On Dec 10, 2013, at 3:42 PM, tmishima_at_[hidden] wrote:
> >>
> >>>
> >>>
> >>> Hi Ralph,
> >>>
> >>> I had a time to try your patch yesterday using openmpi-1.7.4a1r29646.
> >>>
> >>> It stopped the error but unfortunately "mapping by socket" itself didn't
> >>> work well as shown below:
> >>>
> >>> [mishima_at_manage demos]$ qsub -I -l nodes=1:ppn=32
> >>> qsub: waiting for job 8260.manage.cluster to start
> >>> qsub: job 8260.manage.cluster ready
> >>>
> >>> [mishima_at_node04 ~]$ cd ~/Desktop/openmpi-1.7/demos/
> >>> [mishima_at_node04 demos]$ mpirun -np 8 -report-bindings -cpus-per-proc 4
> >>> -map-by socket myprog
> >>> [node04.cluster:27489] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
> >>> [node04.cluster:27489] MCW rank 3 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
> >>> [node04.cluster:27489] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]], socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]: [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
> >>> [node04.cluster:27489] MCW rank 5 bound to socket 2[core 20[hwt 0]], socket 2[core 21[hwt 0]], socket 2[core 22[hwt 0]], socket 2[core 23[hwt 0]]: [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
> >>> [node04.cluster:27489] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]], socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
> >>> [node04.cluster:27489] MCW rank 7 bound to socket 3[core 28[hwt 0]], socket 3[core 29[hwt 0]], socket 3[core 30[hwt 0]], socket 3[core 31[hwt 0]]: [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
> >>> [node04.cluster:27489] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
> >>> [node04.cluster:27489] MCW rank 1 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
> >>> Hello world from process 2 of 8
> >>> Hello world from process 1 of 8
> >>> Hello world from process 3 of 8
> >>> Hello world from process 0 of 8
> >>> Hello world from process 6 of 8
> >>> Hello world from process 5 of 8
> >>> Hello world from process 4 of 8
> >>> Hello world from process 7 of 8
> >>>
> >>> I think this should be like this:
> >>>
> >>> rank 00
> >>> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
> >>> rank 01
> >>> [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
> >>> rank 02
> >>> [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
> >>> ...
> >>>
> >>> Regards,
> >>> Tetsuya Mishima
> >>>
> >>>> I fixed this under the trunk (was an issue regardless of RM) and have
> >>>> scheduled it for 1.7.4.
> >>>>
> >>>> Thanks!
> >>>> Ralph
> >>>>
> >>>> On Nov 25, 2013, at 4:22 PM, tmishima_at_[hidden] wrote:
> >>>>
> >>>>>
> >>>>>
> >>>>> Hi Ralph,
> >>>>>
> >>>>> Thank you very much for your quick response.
> >>>>>
> >>>>> I'm afraid to say that I found one more issuse...
> >>>>>
> >>>>> It's not so serious. Please check it when you have a lot of time.
> >>>>>
> >>>>> The problem is cpus-per-proc with -map-by option under Torque manager.
> >>>>> It doesn't work as shown below. I guess you can get the same
> >>>>> behaviour under Slurm manager.
> >>>>>
> >>>>> Of course, if I remove -map-by option, it works quite well.
> >>>>>
> >>>>> [mishima_at_manage testbed2]$ qsub -I -l nodes=1:ppn=32
> >>>>> qsub: waiting for job 8116.manage.cluster to start
> >>>>> qsub: job 8116.manage.cluster ready
> >>>>>
> >>>>> [mishima_at_node03 ~]$ cd ~/Ducom/testbed2
> >>>>> [mishima_at_node03 testbed2]$ mpirun -np 8 -report-bindings -cpus-per-proc 4
> >>>>> -map-by socket mPre
> >>>>>
> >>>>> --------------------------------------------------------------------------
> >>>>> A request was made to bind to that would result in binding more
> >>>>> processes than cpus on a resource:
> >>>>>
> >>>>> Bind to: CORE
> >>>>> Node: node03
> >>>>> #processes: 2
> >>>>> #cpus: 1
> >>>>>
> >>>>> You can override this protection by adding the "overload-allowed"
> >>>>> option to your binding directive.
> >>>>>
> >>>>> --------------------------------------------------------------------------
> >>>>>
> >>>>>
> >>>>> [mishima_at_node03 testbed2]$ mpirun -np 8 -report-bindings -cpus-per-proc 4
> >>>>> mPre
> >>>>> [node03.cluster:18128] MCW rank 2 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
> >>>>> [node03.cluster:18128] MCW rank 3 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
> >>>>> [node03.cluster:18128] MCW rank 4 bound to socket 2[core 16[hwt 0]], socket 2[core 17[hwt 0]], socket 2[core 18[hwt 0]], socket 2[core 19[hwt 0]]: [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
> >>>>> [node03.cluster:18128] MCW rank 5 bound to socket 2[core 20[hwt 0]], socket 2[core 21[hwt 0]], socket 2[core 22[hwt 0]], socket 2[core 23[hwt 0]]: [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
> >>>>> [node03.cluster:18128] MCW rank 6 bound to socket 3[core 24[hwt 0]], socket 3[core 25[hwt 0]], socket 3[core 26[hwt 0]], socket 3[core 27[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
> >>>>> [node03.cluster:18128] MCW rank 7 bound to socket 3[core 28[hwt 0]], socket 3[core 29[hwt 0]], socket 3[core 30[hwt 0]], socket 3[core 31[hwt 0]]: [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
> >>>>> [node03.cluster:18128] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
> >>>>> [node03.cluster:18128] MCW rank 1 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
> >>>>>
> >>>>> Regards,
> >>>>> Tetsuya Mishima
> >>>>>
> >>>>>> Fixed and scheduled to move to 1.7.4. Thanks again!
> >>>>>>
> >>>>>>
> >>>>>> On Nov 17, 2013, at 6:11 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> >>>>>>
> >>>>>> Thanks! That's precisely where I was going to look when I had time :-)
> >>>>>>
> >>>>>> I'll update tomorrow.
> >>>>>> Ralph
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Sun, Nov 17, 2013 at 7:01 PM, <tmishima_at_[hidden]> wrote:
> >>>>>>
> >>>>>>
> >>>>>> Hi Ralph,
> >>>>>>
> >>>>>> This is the continuing story of "Segmentation fault in oob_tcp.c of
> >>>>>> openmpi-1.7.4a1r29646".
> >>>>>>
> >>>>>> I found the cause.
> >>>>>>
> >>>>>> Firstly, I noticed that your hostfile can work and mine can not.
> >>>>>>
> >>>>>> Your host file:
> >>>>>> cat hosts
> >>>>>> bend001 slots=12
> >>>>>>
> >>>>>> My host file:
> >>>>>> cat hosts
> >>>>>> node08
> >>>>>> node08
> >>>>>> ...(total 8 lines)
> >>>>>>
> >>>>>> I modified my script file to add "slots=1" to each line of my hostfile
> >>>>>> just before launching mpirun. Then it worked.
> >>>>>>
> >>>>>> My host file(modified):
> >>>>>> cat hosts
> >>>>>> node08 slots=1
> >>>>>> node08 slots=1
> >>>>>> ...(total 8 lines)
> >>>>>>
> >>>>>> Secondary, I confirmed that there's a slight difference between
> >>>>>> orte/util/hostfile/hostfile.c of 1.7.3 and that of 1.7.4a1r29646.
> >>>>>>
> >>>>>> $ diff hostfile.c.org ../../../../openmpi-1.7.3/orte/util/hostfile/hostfile.c
> >>>>>> 394,401c394,399
> >>>>>> < if (got_count) {
> >>>>>> < node->slots_given = true;
> >>>>>> < } else if (got_max) {
> >>>>>> < node->slots = node->slots_max;
> >>>>>> < node->slots_given = true;
> >>>>>> < } else {
> >>>>>> < /* should be set by obj_new, but just to be clear */
> >>>>>> < node->slots_given = false;
> >>>>>> ---
> >>>>>>> if (!got_count) {
> >>>>>>> if (got_max) {
> >>>>>>> node->slots = node->slots_max;
> >>>>>>> } else {
> >>>>>>> ++node->slots;
> >>>>>>> }
> >>>>>> ....
> >>>>>>
> >>>>>> Finally, I added the line 402 below just as a tentative trial.
> >>>>>> Then, it worked.
> >>>>>>
> >>>>>> cat -n orte/util/hostfile/hostfile.c:
> >>>>>> ...
> >>>>>> 394 if (got_count) {
> >>>>>> 395 node->slots_given = true;
> >>>>>> 396 } else if (got_max) {
> >>>>>> 397 node->slots = node->slots_max;
> >>>>>> 398 node->slots_given = true;
> >>>>>> 399 } else {
> >>>>>> 400 /* should be set by obj_new, but just to be clear */
> >>>>>> 401 node->slots_given = false;
> >>>>>> 402 ++node->slots; /* added by tmishima */
> >>>>>> 403 }
> >>>>>> ...
> >>>>>>
> >>>>>> Please fix the problem properly, because it's just based on my
> >>>>>> random guess. It's related to the treatment of hostfile where slots
> >>>>>> information is not given.
> >>>>>>
> >>>>>> Regards,
> >>>>>> Tetsuya Mishima
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> users mailing list
> >>>>>> users_at_[hidden]
> >>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>>>> _______________________________________________
> >>>>> users mailing list
> >>>>> users_at_[hidden]
> >>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>>>
> >>>> _______________________________________________
> >>>> users mailing list
> >>>> users_at_[hidden]
> >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>>
> >>> _______________________________________________
> >>> users mailing list
> >>> users_at_[hidden]
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>
> >> _______________________________________________
> >> users mailing list
> >> users_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users