Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] openmpi tar.gz for 1.6.1 or 1.6.2
From: Ralph Castain (rhc_at_[hidden])
Date: 2012-07-16 19:08:21


Or you could just do:

mpirun --slot-list 0-3 -np 4 hostname

That will put the four procs on CPU numbers 0-3, which should all be on the first socket.
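
Whether CPU numbers 0-3 really share a socket can be checked by grouping the "processor" entries of /proc/cpuinfo by their "physical id" field. The following is a minimal sketch, not part of Open MPI: the helper name cpus_by_socket and the abbreviated SAMPLE text are illustrative assumptions, and SAMPLE carries only the two fields the function reads, using the values reported for node48 further down in this thread.

```python
from collections import defaultdict


def cpus_by_socket(cpuinfo_text):
    """Map each 'physical id' (socket) to its sorted logical 'processor' numbers."""
    sockets = defaultdict(list)
    current_cpu = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line between processor stanzas
        key, value = key.strip(), value.strip()
        if key == "processor":
            current_cpu = int(value)
        elif key == "physical id" and current_cpu is not None:
            sockets[int(value)].append(current_cpu)
    return {sock: sorted(cpus) for sock, cpus in sockets.items()}


# Abbreviated sample (assumption: only the two fields used above), matching
# the layout reported in this thread: processors 0-3 on physical id 0,
# processors 4-7 on physical id 1.
SAMPLE = "\n".join(
    f"processor : {cpu}\nphysical id : {cpu // 4}\n" for cpu in range(8)
)

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:  # Linux only
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the abbreviated sample
    for sock, cpus in sorted(cpus_by_socket(text).items()):
        print(f"socket {sock}: cpus {cpus}")
```

For the cpuinfo output quoted below, this grouping puts processors 0-3 on socket 0 and processors 4-7 on socket 1, which is consistent with using --slot-list 0-3 on that node. On machines where the kernel numbers CPUs differently (for example, interleaved across sockets), the list would need to be adjusted.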

On Jul 16, 2012, at 3:23 PM, Dominik Goeddeke wrote:

> In the "old" 1.4.x and 1.5.x series, I achieved this by using rankfiles (see FAQ), and it worked very well. With those versions, --byslot etc. didn't work for me; I always needed the rankfiles. I haven't yet tried the overhauled "convenience wrappers" in 1.6 that you are using for this feature, but I see no reason why the "old" way should not work, although it requires some shell magic if rankfiles are to be generated automatically from, e.g., PBS or SLURM node lists.
>
> Dominik
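
The rankfile approach described above could be automated roughly as follows. This is a sketch under stated assumptions, not a tested recipe for any particular cluster: the helper names make_rankfile and unique_hosts are hypothetical, the node list is assumed to be a text file with one hostname per line (as PBS/Torque provides via PBS_NODEFILE), and the defaults (socket 0, 4 cores per socket) are taken from the node described in this thread. Each generated line uses the "rank N=host slot=socket:core" form of a rankfile entry.

```python
import os


def make_rankfile(hosts, socket=0, cores_per_socket=4):
    """Return rankfile text that pins one rank to each core of `socket` on every host."""
    lines = []
    rank = 0
    for host in hosts:
        for core in range(cores_per_socket):
            # "slot=<socket>:<core>" is the socket:core form of a rankfile entry.
            lines.append(f"rank {rank}={host} slot={socket}:{core}")
            rank += 1
    return "\n".join(lines) + "\n"


def unique_hosts(nodefile_text):
    """Collapse a node list (one hostname per line, repeats allowed) to unique hosts, in order."""
    names = (line.strip() for line in nodefile_text.splitlines())
    return list(dict.fromkeys(name for name in names if name))


if __name__ == "__main__":
    nodefile = os.environ.get("PBS_NODEFILE")  # set by PBS/Torque inside a job
    if nodefile:
        with open(nodefile) as f:
            hosts = unique_hosts(f.read())
    else:
        hosts = ["node48"]  # fallback: a host name that appears in this thread
    print(make_rankfile(hosts), end="")
```

The printed text could be saved to a file (say, a hypothetical "myrankfile") and passed to mpirun with -rf. Whether the socket and core numbers are interpreted as physical or logical indexes depends on the Open MPI version, so the mpirun man page for the installed release is worth checking before relying on this.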
>
> On 07/17/2012 12:13 AM, Anne M. Hammond wrote:
>> There are 2 physical processors, each with 4 cores (no hyperthreading).
>>
>> I want to instruct openmpi to run only on the first processor, using 4 cores.
>>
>>
>> [hammond_at_node48 ~]$ cat /proc/cpuinfo
>> processor : 0
>> vendor_id : AuthenticAMD
>> cpu family : 16
>> model : 4
>> model name : Quad-Core AMD Opteron(tm) Processor 2376
>> stepping : 2
>> cpu MHz : 2311.694
>> cache size : 512 KB
>> physical id : 0
>> siblings : 4
>> core id : 0
>> cpu cores : 4
>> apicid : 0
>> initial apicid : 0
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 5
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
>> bogomips : 4623.38
>> TLB size : 1024 4K pages
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 48 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>
>> processor : 1
>> vendor_id : AuthenticAMD
>> cpu family : 16
>> model : 4
>> model name : Quad-Core AMD Opteron(tm) Processor 2376
>> stepping : 2
>> cpu MHz : 2311.694
>> cache size : 512 KB
>> physical id : 0
>> siblings : 4
>> core id : 1
>> cpu cores : 4
>> apicid : 1
>> initial apicid : 1
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 5
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
>> bogomips : 4623.17
>> TLB size : 1024 4K pages
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 48 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>
>> processor : 2
>> vendor_id : AuthenticAMD
>> cpu family : 16
>> model : 4
>> model name : Quad-Core AMD Opteron(tm) Processor 2376
>> stepping : 2
>> cpu MHz : 2311.694
>> cache size : 512 KB
>> physical id : 0
>> siblings : 4
>> core id : 2
>> cpu cores : 4
>> apicid : 2
>> initial apicid : 2
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 5
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
>> bogomips : 4623.19
>> TLB size : 1024 4K pages
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 48 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>
>> processor : 3
>> vendor_id : AuthenticAMD
>> cpu family : 16
>> model : 4
>> model name : Quad-Core AMD Opteron(tm) Processor 2376
>> stepping : 2
>> cpu MHz : 2311.694
>> cache size : 512 KB
>> physical id : 0
>> siblings : 4
>> core id : 3
>> cpu cores : 4
>> apicid : 3
>> initial apicid : 3
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 5
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
>> bogomips : 4623.16
>> TLB size : 1024 4K pages
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 48 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>
>> processor : 4
>> vendor_id : AuthenticAMD
>> cpu family : 16
>> model : 4
>> model name : Quad-Core AMD Opteron(tm) Processor 2376
>> stepping : 2
>> cpu MHz : 2311.694
>> cache size : 512 KB
>> physical id : 1
>> siblings : 4
>> core id : 0
>> cpu cores : 4
>> apicid : 4
>> initial apicid : 4
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 5
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
>> bogomips : 4623.16
>> TLB size : 1024 4K pages
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 48 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>
>> processor : 5
>> vendor_id : AuthenticAMD
>> cpu family : 16
>> model : 4
>> model name : Quad-Core AMD Opteron(tm) Processor 2376
>> stepping : 2
>> cpu MHz : 2311.694
>> cache size : 512 KB
>> physical id : 1
>> siblings : 4
>> core id : 1
>> cpu cores : 4
>> apicid : 5
>> initial apicid : 5
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 5
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
>> bogomips : 4623.16
>> TLB size : 1024 4K pages
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 48 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>
>> processor : 6
>> vendor_id : AuthenticAMD
>> cpu family : 16
>> model : 4
>> model name : Quad-Core AMD Opteron(tm) Processor 2376
>> stepping : 2
>> cpu MHz : 2311.694
>> cache size : 512 KB
>> physical id : 1
>> siblings : 4
>> core id : 2
>> cpu cores : 4
>> apicid : 6
>> initial apicid : 6
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 5
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
>> bogomips : 4623.17
>> TLB size : 1024 4K pages
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 48 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>
>> processor : 7
>> vendor_id : AuthenticAMD
>> cpu family : 16
>> model : 4
>> model name : Quad-Core AMD Opteron(tm) Processor 2376
>> stepping : 2
>> cpu MHz : 2311.694
>> cache size : 512 KB
>> physical id : 1
>> siblings : 4
>> core id : 3
>> cpu cores : 4
>> apicid : 7
>> initial apicid : 7
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 5
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
>> bogomips : 4623.18
>> TLB size : 1024 4K pages
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 48 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate
>>
>>
>> On Jul 16, 2012, at 4:09 PM, Elken, Tom wrote:
>>
>>> Anne,
>>>
>>> Output from "cat /proc/cpuinfo" on your node "hostname" may help those trying to answer.
>>>
>>> -Tom
>>>
>>>> -----Original Message-----
>>>> From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On
>>>> Behalf Of Ralph Castain
>>>> Sent: Monday, July 16, 2012 2:47 PM
>>>> To: Open MPI Users
>>>> Subject: Re: [OMPI users] openmpi tar.gz for 1.6.1 or 1.6.2
>>>>
>>>> I gather there are two sockets on this node? So the second cmd line is equivalent
>>>> to leaving "num-sockets" off of the cmd line?
>>>>
>>>> I haven't tried what you are doing, so it is quite possible this is a bug.
>>>>
>>>>
>>>> On Jul 16, 2012, at 1:49 PM, Anne M. Hammond wrote:
>>>>
>>>>> Thanks!
>>>>>
>>>>> Built the latest snapshot. Still getting an error when trying to run
>>>>> on only one socket (see below): Is there a workaround?
>>>>>
>>>>> [hammond_at_node65 bin]$ ./mpirun -np 4 --num-sockets 1 --npersocket 4 hostname
>>>>> --------------------------------------------------------------------------
>>>>> An invalid physical processor ID was returned when attempting to
>>>>> bind an MPI process to a unique processor.
>>>>>
>>>>> This usually means that you requested binding to more processors than
>>>>> exist (e.g., trying to bind N MPI processes to M processors, where N >
>>>>> M). Double check that you have enough unique processors for all the
>>>>> MPI processes that you are launching on this host.
>>>>>
>>>>> You job will now abort.
>>>>> --------------------------------------------------------------------------
>>>>> --------------------------------------------------------------------------
>>>>> mpirun was unable to start the specified application as it
>>>>> encountered an error:
>>>>>
>>>>> Error name: Fatal
>>>>> Node: node65.cl.corp.com
>>>>>
>>>>> when attempting to start process rank 0.
>>>>> --------------------------------------------------------------------------
>>>>> 4 total processes failed to start
>>>>>
>>>>>
>>>>> [hammond_at_node65 bin]$ ./mpirun -np 4 --num-sockets 2 --npersocket 4 hostname
>>>>> node65.cl.corp.com
>>>>> node65.cl.corp.com
>>>>> node65.cl.corp.com
>>>>> node65.cl.corp.com
>>>>> [hammond_at_node65 bin]$
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Jul 16, 2012, at 12:56 PM, Ralph Castain wrote:
>>>>>
>>>>>> Jeff is at the MPI Forum this week, so his answers will be delayed. Last I heard, it was close, but no specific date has been set.
>>>>>>
>>>>>>
>>>>>> On Jul 16, 2012, at 11:49 AM, Michael E. Thomadakis wrote:
>>>>>>
>>>>>>> When is the official 1.6.1 (or 1.6.2?) release expected to be available?
>>>>>>>
>>>>>>> mike
>>>>>>>
>>>>>>> On 07/16/2012 01:44 PM, Ralph Castain wrote:
>>>>>>>> You can get it here:
>>>>>>>>
>>>>>>>> http://www.open-mpi.org/nightly/v1.6/
>>>>>>>>
>>>>>>>> On Jul 16, 2012, at 10:22 AM, Anne M. Hammond wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> For benchmarking, we would like to use openmpi with
>>>>>>>>> --num-sockets 1
>>>>>>>>>
>>>>>>>>> This fails in 1.6, but Bug Report #3119 indicates it is changed in
>>>>>>>>> 1.6.1.
>>>>>>>>>
>>>>>>>>> Is 1.6.1 or 1.6.2 available in tar.gz form?
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>> Anne
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> users mailing list
>>>>>>>>> users_at_[hidden]
>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>>> Anne M. Hammond - Systems / Network Administration - Tech-X Corp
>>>>> hammond_at_txcorp.com 720-974-1840
>>
>> Anne M. Hammond - Systems / Network Administration - Tech-X Corp
>> hammond_at_txcorp.com 720-974-1840
>
> --
> Jun.-Prof. Dr. Dominik Göddeke
> Hardware-orientierte Numerik für große Systeme
> Institut für Angewandte Mathematik (LS III)
> Fakultät für Mathematik, Technische Universität Dortmund
> http://www.mathematik.tu-dortmund.de/~goeddeke
> Tel. +49-(0)231-755-7218 Fax +49-(0)231-755-5933
> --
> Sent from my old-fashioned computer and not from a mobile device.
> I proudly boycott 24/7 availability.