Subject: Re: [hwloc-users] hwloc with scalemp
From: Brock Palen (brockp_at_[hidden])
Date: 2010-04-07 16:46:53


Brice Goglin wrote:

> Brock Palen wrote:
>> Has anyone done work with hwloc on ScaleMP systems? They provide
>> their own tool, numabind, but we are looking for a more generic
>> solution to process placement and control that works well inside our
>> MPI library (Open MPI in most cases).
>>
>> Any input on this would be great!
>
> Hello Brock,
>
> From what I remember, ScaleMP uses a hypervisor on each node that
> virtually merges all of them into one big fake shared-memory machine.
> A vanilla Linux kernel then runs on top of it. So hwloc should just see
> regular cores and NUMA node information, assuming the virtual "merged"
> hardware reports all necessary information to the OS.
>
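
For what it's worth, here is a minimal sketch of how one could cross-check
that from the hwloc C API (assuming the hwloc 1.x-era names, where the
NUMA-node constant is still spelled HWLOC_OBJ_NODE; I have not run this on
the ScaleMP box, so treat it as illustrative only):

#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topology;

    hwloc_topology_init(&topology);   /* create an empty topology context */
    hwloc_topology_load(topology);    /* detect the (virtualized) machine */

    /* Count the NUMA nodes and cores that hwloc discovered. */
    printf("NUMA nodes: %d\n",
           hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_NODE));
    printf("cores:      %d\n",
           hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE));

    hwloc_topology_destroy(topology);
    return 0;
}

On the topology below, that should report 8 NUMA nodes and 32 cores if the
hypervisor exposes everything correctly.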

Running lstopo 0.9.3, it appears that hwloc does see the extra layer
of complexity:

[brockp_at_nyx0809 INTEL]$ lstopo -
System(79GB)
   Misc0
     Node#0(10GB) + Socket#1 + L3(8192KB)
       L2(256KB) + L1(32KB) + Core#0 + P#0
       L2(256KB) + L1(32KB) + Core#1 + P#1
       L2(256KB) + L1(32KB) + Core#2 + P#2
       L2(256KB) + L1(32KB) + Core#3 + P#3
     Node#1(10GB) + Socket#0 + L3(8192KB)
       L2(256KB) + L1(32KB) + Core#0 + P#4
       L2(256KB) + L1(32KB) + Core#1 + P#5
       L2(256KB) + L1(32KB) + Core#2 + P#6
       L2(256KB) + L1(32KB) + Core#3 + P#7
   Misc0
     Node#2(10GB) + Socket#3 + L3(8192KB)
       L2(256KB) + L1(32KB) + Core#0 + P#8
       L2(256KB) + L1(32KB) + Core#1 + P#9
       L2(256KB) + L1(32KB) + Core#2 + P#10
       L2(256KB) + L1(32KB) + Core#3 + P#11
     Node#3(10GB) + Socket#2 + L3(8192KB)
       L2(256KB) + L1(32KB) + Core#0 + P#12
       L2(256KB) + L1(32KB) + Core#1 + P#13
       L2(256KB) + L1(32KB) + Core#2 + P#14
       L2(256KB) + L1(32KB) + Core#3 + P#15
   Misc0
     Node#4(10GB) + Socket#5 + L3(8192KB)
       L2(256KB) + L1(32KB) + Core#0 + P#16
       L2(256KB) + L1(32KB) + Core#1 + P#17
       L2(256KB) + L1(32KB) + Core#2 + P#18
       L2(256KB) + L1(32KB) + Core#3 + P#19
     Node#5(10GB) + Socket#4 + L3(8192KB)
       L2(256KB) + L1(32KB) + Core#0 + P#20
       L2(256KB) + L1(32KB) + Core#1 + P#21
       L2(256KB) + L1(32KB) + Core#2 + P#22
       L2(256KB) + L1(32KB) + Core#3 + P#23
   Misc0
     Node#6(10GB) + Socket#7 + L3(8192KB)
       L2(256KB) + L1(32KB) + Core#0 + P#24
       L2(256KB) + L1(32KB) + Core#1 + P#25
       L2(256KB) + L1(32KB) + Core#2 + P#26
       L2(256KB) + L1(32KB) + Core#3 + P#27
     Node#7(10GB) + Socket#6 + L3(8192KB)
       L2(256KB) + L1(32KB) + Core#0 + P#28
       L2(256KB) + L1(32KB) + Core#1 + P#29
       L2(256KB) + L1(32KB) + Core#2 + P#30
       L2(256KB) + L1(32KB) + Core#3 + P#31

I don't know why they are all labeled Misc0, but hwloc does see the
extra layer.
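
If it helps to figure out where those Misc0 labels come from, a short
tree walk over the hwloc objects would print each object's type, index,
and name field. This is again a rough sketch against the 1.x-style API
(the name field may simply be empty for these Misc objects):

#include <hwloc.h>
#include <stdio.h>

/* Recursively print each object's type, logical index, and name (if any). */
static void print_tree(hwloc_obj_t obj, int depth)
{
    unsigned i;
    printf("%*s%s#%u%s%s\n", 2 * depth, "",
           hwloc_obj_type_string(obj->type), obj->logical_index,
           obj->name ? " name=" : "", obj->name ? obj->name : "");
    for (i = 0; i < obj->arity; i++)
        print_tree(obj->children[i], depth + 1);
}

int main(void)
{
    hwloc_topology_t topology;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);
    print_tree(hwloc_get_root_obj(topology), 0);
    hwloc_topology_destroy(topology);
    return 0;
}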

If you want other information, let me know.

> There's a bit of ScaleMP code in the Linux kernel, but it does pretty
> much nothing; it does not seem to add anything to /proc or /sys, for
> instance. So I am not sure hwloc could get any specialized knowledge
> of ScaleMP machines. Maybe their custom numabind tool knows that
> ScaleMP machines only come with some well-defined types/counts/numbering
> of processors and NUMA nodes, and thus uses this information to group
> sockets/NUMA nodes depending on their physical distance.
>
> Brice
>
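
Regarding the placement side mentioned above: if numabind essentially
groups sockets/NUMA nodes by distance, the generic hwloc equivalent for
us would presumably be binding each rank to the cpuset of its local NUMA
node. A rough, untested sketch (again assuming the 1.x-era constant
HWLOC_OBJ_NODE; a real launcher would pick the node per rank rather than
hard-coding node 0):

#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_obj_t node;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    /* Bind the current process to the first NUMA node's cpuset.
       (Hypothetical choice: a launcher would select the node per rank.) */
    node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, 0);
    if (node == NULL)
        fprintf(stderr, "no NUMA node objects found\n");
    else if (hwloc_set_cpubind(topology, node->cpuset, HWLOC_CPUBIND_PROCESS) < 0)
        perror("hwloc_set_cpubind");

    hwloc_topology_destroy(topology);
    return 0;
}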