Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST
From: Mike Dubman (mike.ompi_at_[hidden])
Date: 2012-05-30 09:14:47


or, lstopo lies (Im not using the latest hwloc but one which comes with
distro).
The machine has two dual-code sockets, total 4 physical cores:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
stepping : 7
cpu MHz : 1200.000
cache size : 20480 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xto
pology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx
est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt aes xsave avx
lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi fle
xpriority ept vpid
bogomips : 4000.12
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
stepping : 7
cpu MHz : 1200.000
cache size : 20480 KB
physical id : 1
siblings : 2
core id : 0
cpu cores : 1
apicid : 32
initial apicid : 32
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xto
pology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx
est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt aes xsave avx
lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi fle
xpriority ept vpid
bogomips : 3999.44
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
stepping : 7
cpu MHz : 1200.000
cache size : 20480 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xto
pology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx
est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt aes xsave avx
lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi fle
xpriority ept vpid
bogomips : 3999.42
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
stepping : 7
cpu MHz : 1200.000
cache size : 20480 KB
physical id : 1
siblings : 2
core id : 0
cpu cores : 1
apicid : 33
initial apicid : 33
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology
nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2
ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt aes xsave avx lahf_lm
ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips : 3999.44
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

On Wed, May 30, 2012 at 3:40 PM, Ralph Castain <rhc_at_[hidden]> wrote:

> Hmmm...well, from what I see, mpirun was actually giving you the right
> answer! I only see TWO cores on each node, yet you told it to bind FOUR
> processes on each node, each proc to be bound to a unique core.
>
> The error message was correct - there are not enough cores on those nodes
> to do what you requested.
>
>
> On May 30, 2012, at 6:19 AM, Mike Dubman wrote:
>
> attached.
>
> On Wed, May 30, 2012 at 2:32 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>
>> On May 30, 2012, at 7:20 AM, Jeff Squyres wrote:
>>
>> >> $hwloc-ls --of console
>> >> Machine (32GB)
>> >> NUMANode L#0 (P#0 16GB) + Socket L#0 + L3 L#0 (20MB) + L2 L#0 (256KB)
>> + L1 L#0 (32KB) + Core L#0
>> >> PU L#0 (P#0)
>> >> PU L#1 (P#2)
>> >> NUMANode L#1 (P#1 16GB) + Socket L#1 + L3 L#1 (20MB) + L2 L#1 (256KB)
>> + L1 L#1 (32KB) + Core L#1
>> >> PU L#2 (P#1)
>> >> PU L#3 (P#3)
>> >
>> > Is this hwloc output exactly the same on both nodes?
>>
>>
>> More specifically, can you send the lstopo xml output from each of the 2
>> nodes you ran on?
>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
> <lstopo-out.tbz>_______________________________________________
>
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>