
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] 1.6.2 affinity failures
From: Ralph Castain (rhc_at_[hidden])
Date: 2012-12-20 10:46:53


Hmmm... I'll see what I can do about the error message. I don't think there is much I can do in 1.6, but in 1.7 I could generate an appropriate error message, as we have a way to check the topologies.
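
In the meantime, one way to spot a mismatched node yourself is to hash each node's topology report and compare across the cluster. A rough sketch (the hostlist is hypothetical, and nodes whose reported memory sizes round differently could still hash apart):

  # nodes with identical topologies produce identical hashes;
  # the odd node shows up as its own group in dshbak's output
  pdsh -w nyx[0930-0936] 'lstopo --of console | md5sum' | dshbak -c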

On Dec 20, 2012, at 7:11 AM, Brock Palen <brockp_at_[hidden]> wrote:

> Ralph,
>
> Thanks for the info,
> That said, I found the problem: one of the new nodes had hyperthreading enabled and the rest didn't, so the nodes didn't match. A quick
>
> pdsh lstopo | dshbak -c
>
> uncovered the one node that differed. The error just didn't give me any clue that this was the cause, which was very odd:
>
> Correct node:
> [brockp_at_nyx0930 ~]$ lstopo
> Machine (64GB)
>   NUMANode L#0 (P#0 32GB) + Socket L#0 + L3 L#0 (20MB)
>     L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>     L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1 + PU L#1 (P#1)
>     L2 L#2 (256KB) + L1 L#2 (32KB) + Core L#2 + PU L#2 (P#2)
>     L2 L#3 (256KB) + L1 L#3 (32KB) + Core L#3 + PU L#3 (P#3)
>     L2 L#4 (256KB) + L1 L#4 (32KB) + Core L#4 + PU L#4 (P#4)
>     L2 L#5 (256KB) + L1 L#5 (32KB) + Core L#5 + PU L#5 (P#5)
>     L2 L#6 (256KB) + L1 L#6 (32KB) + Core L#6 + PU L#6 (P#6)
>     L2 L#7 (256KB) + L1 L#7 (32KB) + Core L#7 + PU L#7 (P#7)
>   NUMANode L#1 (P#1 32GB) + Socket L#1 + L3 L#1 (20MB)
>     L2 L#8 (256KB) + L1 L#8 (32KB) + Core L#8 + PU L#8 (P#8)
>     L2 L#9 (256KB) + L1 L#9 (32KB) + Core L#9 + PU L#9 (P#9)
>     L2 L#10 (256KB) + L1 L#10 (32KB) + Core L#10 + PU L#10 (P#10)
>     L2 L#11 (256KB) + L1 L#11 (32KB) + Core L#11 + PU L#11 (P#11)
>     L2 L#12 (256KB) + L1 L#12 (32KB) + Core L#12 + PU L#12 (P#12)
>     L2 L#13 (256KB) + L1 L#13 (32KB) + Core L#13 + PU L#13 (P#13)
>     L2 L#14 (256KB) + L1 L#14 (32KB) + Core L#14 + PU L#14 (P#14)
>     L2 L#15 (256KB) + L1 L#15 (32KB) + Core L#15 + PU L#15 (P#15)
>
>
> Bad node:
> [brockp_at_nyx0936 ~]$ lstopo
> Machine (64GB)
>   NUMANode L#0 (P#0 32GB) + Socket L#0 + L3 L#0 (20MB)
>     L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
>       PU L#0 (P#0)
>       PU L#1 (P#16)
>     L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1
>       PU L#2 (P#1)
>       PU L#3 (P#17)
>     L2 L#2 (256KB) + L1 L#2 (32KB) + Core L#2
>       PU L#4 (P#2)
>       PU L#5 (P#18)
>     L2 L#3 (256KB) + L1 L#3 (32KB) + Core L#3
>       PU L#6 (P#3)
>       PU L#7 (P#19)
>     L2 L#4 (256KB) + L1 L#4 (32KB) + Core L#4
>       PU L#8 (P#4)
>       PU L#9 (P#20)
>     L2 L#5 (256KB) + L1 L#5 (32KB) + Core L#5
>       PU L#10 (P#5)
>       PU L#11 (P#21)
>     L2 L#6 (256KB) + L1 L#6 (32KB) + Core L#6
>       PU L#12 (P#6)
>       PU L#13 (P#22)
>     L2 L#7 (256KB) + L1 L#7 (32KB) + Core L#7
>       PU L#14 (P#7)
>       PU L#15 (P#23)
>   NUMANode L#1 (P#1 32GB) + Socket L#1 + L3 L#1 (20MB)
>     L2 L#8 (256KB) + L1 L#8 (32KB) + Core L#8
>       PU L#16 (P#8)
>       PU L#17 (P#24)
>     L2 L#9 (256KB) + L1 L#9 (32KB) + Core L#9
>       PU L#18 (P#9)
>       PU L#19 (P#25)
>     L2 L#10 (256KB) + L1 L#10 (32KB) + Core L#10
>       PU L#20 (P#10)
>       PU L#21 (P#26)
>     L2 L#11 (256KB) + L1 L#11 (32KB) + Core L#11
>       PU L#22 (P#11)
>       PU L#23 (P#27)
>     L2 L#12 (256KB) + L1 L#12 (32KB) + Core L#12
>       PU L#24 (P#12)
>       PU L#25 (P#28)
>     L2 L#13 (256KB) + L1 L#13 (32KB) + Core L#13
>       PU L#26 (P#13)
>       PU L#27 (P#29)
>     L2 L#14 (256KB) + L1 L#14 (32KB) + Core L#14
>       PU L#28 (P#14)
>       PU L#29 (P#30)
>     L2 L#15 (256KB) + L1 L#15 (32KB) + Core L#15
>       PU L#30 (P#15)
>       PU L#31 (P#31)
>
> Once I removed that node from the pool, the error went away, and bind-to-core with cpus-per-rank worked.
>
> I don't see how an error message of the sort given would ever have led me to look for a node with 'more' cores, even fake ones; I was looking for a node with a bad socket or the wrong part.
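>
> For future reference, a more direct scan for this would be to check threads-per-core everywhere, something like this sketch (hostlist hypothetical):
>
> # HT-enabled nodes report "Thread(s) per core: 2", the rest report 1
> pdsh -w nyx[0900-0999] "lscpu | grep 'Thread(s) per core'" | dshbak -c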
>
>
> Brock Palen
> www.umich.edu/~brockp
> CAEN Advanced Computing
> brockp_at_[hidden]
> (734)936-1985
>
>
>
> On Dec 19, 2012, at 9:08 PM, Ralph Castain wrote:
>
>> I'm afraid these are both known problems in the 1.6.2 release. I believe we fixed npersocket in 1.6.3, though you might check to be sure. On the large-scale issue, cpus-per-rank might well fail under those conditions; the algorithm in the 1.6 series hasn't seen much use, especially at scale.
>>
>> In fact, cpus-per-rank has somewhat fallen by the wayside recently due to an apparent lack of interest. I'm restoring it for the 1.7 series over the holiday (it currently doesn't work in 1.7 or on the trunk).
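>>
>> As a possible stopgap for the npersocket case on 1.6.2: since the message shows mpirun detected "number of sockets: 0", you could try declaring the topology explicitly instead of letting mpirun probe it. This is only a sketch, and only applies if your build has these options (check mpirun --help first):
>>
>> mpirun -np 4 -npersocket 2 -num-sockets 2 -num-cores 6 uptime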
>>
>>
>> On Dec 19, 2012, at 4:34 PM, Brock Palen <brockp_at_[hidden]> wrote:
>>
>>> Using Open MPI 1.6.2 with Intel 13.0, though the problem is not specific to the compiler.
>>>
>>> Using two 12-core, 2-socket nodes:
>>>
>>> mpirun -np 4 -npersocket 2 uptime
>>> --------------------------------------------------------------------------
>>> Your job has requested a conflicting number of processes for the
>>> application:
>>>
>>> App: uptime
>>> number of procs: 4
>>>
>>> This is more processes than we can launch under the following
>>> additional directives and conditions:
>>>
>>> number of sockets: 0
>>> npersocket: 2
>>>
>>>
>>> Any idea why this wouldn't work?
>>>
>>> Another problem: the following does what I expect on nodes with two 8-core sockets (16 cores/node):
>>>
>>> mpirun -np 8 -npernode 4 -bind-to-core -cpus-per-rank 4 hwloc-bind --get
>>> 0x0000000f
>>> 0x0000000f
>>> 0x000000f0
>>> 0x000000f0
>>> 0x00000f00
>>> 0x00000f00
>>> 0x0000f000
>>> 0x0000f000
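>>>
>>> (Reading those masks: 0x0000000f = PUs 0-3, 0x000000f0 = PUs 4-7, 0x00000f00 = PUs 8-11, 0x0000f000 = PUs 12-15, so each rank is bound to 4 consecutive cores; each mask appears twice because the same pattern repeats on each of the two nodes.)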
>>>
>>> But it fails at larger scale:
>>>
>>> mpirun -np 276 -npernode 4 -bind-to-core -cpus-per-rank 4 hwloc-bind --get
>>>
>>> --------------------------------------------------------------------------
>>> An invalid physical processor ID was returned when attempting to bind
>>> an MPI process to a unique processor.
>>>
>>> This usually means that you requested binding to more processors than
>>> exist (e.g., trying to bind N MPI processes to M processors, where N >
>>> M). Double check that you have enough unique processors for all the
>>> MPI processes that you are launching on this host.
>>> Your job will now abort.
>>> --------------------------------------------------------------------------
>>>
>>>
>>>
>>> Brock Palen
>>> www.umich.edu/~brockp
>>> CAEN Advanced Computing
>>> brockp_at_[hidden]
>>> (734)936-1985