
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] OMPI 1.6 affinity fixes: PLEASE TEST
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2012-05-30 09:36:38


Your /proc/cpuinfo output (filtered below) shows only two sockets
(physical ids 0 and 1), each with one core (cpu cores=1, core id=0)
and hyperthreading (siblings=2). So lstopo looks correct.
The E5-2650 is supposed to have 8 cores, so I assume you are using Linux
cgroups/cpusets to restrict the available cores. The misconfiguration
may be there.
Brice
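
P.S. A quick way to check whether a cpuset is restricting what the job can
see (a sketch; the cgroup v1 cpuset mount point below is an assumption and
varies between distros):

$ grep Cpus_allowed_list /proc/self/status
$ cat /sys/fs/cgroup/cpuset/cpuset.cpus   # or /dev/cpuset/cpus, depending on where cpusets are mounted
$ hwloc-ls --whole-system --of console    # also shows the cores hidden by the current cpuset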

On 30/05/2012 15:14, Mike Dubman wrote:
> Or lstopo lies (I'm not using the latest hwloc, but the one that comes
> with the distro).
> The machine has two dual-core sockets, 4 physical cores in total:
> processor : 0
> physical id : 0
> siblings : 2
> core id : 0
> cpu cores : 1
>
> processor : 1
> physical id : 1
> siblings : 2
> core id : 0
> cpu cores : 1
>
> processor : 2
> physical id : 0
> siblings : 2
> core id : 0
> cpu cores : 1
>
> processor : 3
> physical id : 1
> siblings : 2
> core id : 0
> cpu cores : 1
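
As a quick sanity check on the excerpt above (a sketch; it counts unique
"physical id"/"core id" pairs, which works because the two lines are adjacent
in /proc/cpuinfo):

$ grep -E "^(physical id|core id)" /proc/cpuinfo | paste - - | sort -u | wc -l   # physical cores
$ grep "^physical id" /proc/cpuinfo | sort -u | wc -l                            # sockets
$ grep -c "^processor" /proc/cpuinfo                                             # logical CPUs

On the output above, the first command gives 2, i.e. two physical cores in total.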
>
>
>
> On Wed, May 30, 2012 at 3:40 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>
> Hmmm...well, from what I see, mpirun was actually giving you the
> right answer! I only see TWO cores on each node, yet you told it
> to bind FOUR processes on each node, each proc to be bound to a
> unique core.
>
> The error message was correct - there are not enough cores on
> those nodes to do what you requested.
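
For what it's worth, one way to double-check how many cores each node actually
exposes to the job (a sketch; the hostnames are placeholders):

$ mpirun -npernode 1 -host node1,node2 -tag-output hwloc-ls --of console --only core

Each "Core L#..." line is one core visible inside that node's cpuset, and
-tag-output prefixes every line with the rank that printed it.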
>
>
> On May 30, 2012, at 6:19 AM, Mike Dubman wrote:
>
>> attached.
>>
>> On Wed, May 30, 2012 at 2:32 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>>
>> On May 30, 2012, at 7:20 AM, Jeff Squyres wrote:
>>
>> >> $hwloc-ls --of console
>> >> Machine (32GB)
>> >> NUMANode L#0 (P#0 16GB) + Socket L#0 + L3 L#0 (20MB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
>> >> PU L#0 (P#0)
>> >> PU L#1 (P#2)
>> >> NUMANode L#1 (P#1 16GB) + Socket L#1 + L3 L#1 (20MB) + L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1
>> >> PU L#2 (P#1)
>> >> PU L#3 (P#3)
>> >
>> > Is this hwloc output exactly the same on both nodes?
>>
>>
>> More specifically, can you send the lstopo xml output from
>> each of the 2 nodes you ran on?
>>
>> --
>> Jeff Squyres
>> jsquyres_at_[hidden]
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
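
In case it saves a round trip, the XML can be generated directly on each node
with something like this (the output filename is just a placeholder):

$ hwloc-ls --of xml node1-topology.xml    # or simply: lstopo node1-topology.xml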
>>
>>
>>
>> <lstopo-out.tbz>
>
>
>
>
>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel