Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Using physical numbering in a rankfile
From: teng ma (tma_at_[hidden])
Date: 2012-02-02 12:22:48


I made a mistake in my previous reply. You can write the rankfile in either of two ways:
rank 0=host1 slot=0
rank 1=host1 slot=2
rank 2=host1 slot=4
rank 3=host1 slot=6
rank 4=host1 slot=1
rank 5=host1 slot=3
rank 6=host1 slot=5
rank 7=host1 slot=7

or

rank 0=host1 slot=0:0
rank 1=host1 slot=0:1
rank 2=host1 slot=0:2
rank 3=host1 slot=0:3
rank 4=host1 slot=1:0
rank 5=host1 slot=1:1
rank 6=host1 slot=1:2
rank 7=host1 slot=1:3
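
For reference, here is a minimal Python sketch that regenerates both listings above from the topology in the lstopo output quoted below. The TOPOLOGY table and both function names are hypothetical, made up for illustration. The script only reproduces the text of the two rankfiles; it makes no claim about whether Open MPI 1.5.4 treats the numbers as logical or physical indices.

```python
# Hypothetical table, copied by hand from the lstopo output quoted below:
# socket index -> P# numbers of the PUs on that socket, in logical core order.
TOPOLOGY = {0: [0, 2, 4, 6], 1: [1, 3, 5, 7]}


def rankfile_by_pu(topology, host):
    """Return rankfile lines of the form 'slot=N', one rank per PU."""
    lines = []
    rank = 0
    for socket in sorted(topology):
        for pu in topology[socket]:
            lines.append(f"rank {rank}={host} slot={pu}")
            rank += 1
    return lines


def rankfile_by_socket_core(topology, host):
    """Return rankfile lines of the form 'slot=S:C', one rank per core."""
    lines = []
    rank = 0
    for socket in sorted(topology):
        # Cores are numbered 0..n-1 within each socket.
        for core in range(len(topology[socket])):
            lines.append(f"rank {rank}={host} slot={socket}:{core}")
            rank += 1
    return lines


if __name__ == "__main__":
    print("\n".join(rankfile_by_pu(TOPOLOGY, "host1")))
    print()
    print("\n".join(rankfile_by_socket_core(TOPOLOGY, "host1")))
```

Redirecting either half of the output to a file gives a rankfile that can be passed to mpiexec with --rankfile, as in the original question.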

Teng

On Thu, Feb 2, 2012 at 12:17 PM, teng ma <tma_at_[hidden]> wrote:

> Just remove the "p" prefix from the slot entries in your rankfile, like this:
>
> rank 0=host1 slot=0:0
> rank 1=host1 slot=0:2
> rank 2=host1 slot=0:4
> rank 3=host1 slot=0:6
> rank 4=host1 slot=1:1
> rank 5=host1 slot=1:3
> rank 6=host1 slot=1:5
> rank 7=host1 slot=1:7
>
> Teng
>
> 2012/2/2 François Tessier <francois.tessier_at_[hidden]>
>
>> Hello,
>>
>> I need to use a rankfile with Open MPI 1.5.4 to run some tests on a basic
>> architecture. I'm using a node for which lstopo returns the following:
>>
>> ----------------
>> Machine (24GB)
>>   NUMANode L#0 (P#0 12GB)
>>     Socket L#0 + L3 L#0 (8192KB)
>>       L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>>       L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1 + PU L#1 (P#2)
>>       L2 L#2 (256KB) + L1 L#2 (32KB) + Core L#2 + PU L#2 (P#4)
>>       L2 L#3 (256KB) + L1 L#3 (32KB) + Core L#3 + PU L#3 (P#6)
>>     HostBridge L#0
>>       PCIBridge
>>         PCI 8086:10c9
>>           Net L#0 "eth0"
>>         PCI 8086:10c9
>>           Net L#1 "eth1"
>>       PCIBridge
>>         PCI 15b3:673c
>>           Net L#2 "ib0"
>>           Net L#3 "ib1"
>>           OpenFabrics L#4 "mlx4_0"
>>       PCIBridge
>>         PCI 102b:0522
>>       PCI 8086:3a22
>>         Block L#5 "sda"
>>         Block L#6 "sdb"
>>         Block L#7 "sdc"
>>         Block L#8 "sdd"
>>   NUMANode L#1 (P#1 12GB) + Socket L#1 + L3 L#1 (8192KB)
>>     L2 L#4 (256KB) + L1 L#4 (32KB) + Core L#4 + PU L#4 (P#1)
>>     L2 L#5 (256KB) + L1 L#5 (32KB) + Core L#5 + PU L#5 (P#3)
>>     L2 L#6 (256KB) + L1 L#6 (32KB) + Core L#6 + PU L#6 (P#5)
>>     L2 L#7 (256KB) + L1 L#7 (32KB) + Core L#7 + PU L#7 (P#7)
>> ----------------
>>
>> I would like to use the physical numbering. To do that, I created a
>> rankfile like this:
>>
>> rank 0=host1 slot=p0:0
>> rank 1=host1 slot=p0:2
>> rank 2=host1 slot=p0:4
>> rank 3=host1 slot=p0:6
>> rank 4=host1 slot=p1:1
>> rank 5=host1 slot=p1:3
>> rank 6=host1 slot=p1:5
>> rank 7=host1 slot=p1:7
>>
>> But when I run my job with "mpiexec -np 8 --rankfile rankfile ./foo",
>> I encounter this error:
>>
>> Specified slot list: p0:4
>> Error: Not found
>>
>> This could mean that a non-existent processor was specified, or
>> that the specification had improper syntax.
>>
>>
>> Do you know what I did wrong?
>>
>> Best regards,
>>
>> François
>>
>> --
>> ___________________
>> François TESSIER
>> PhD Student at University of Bordeaux
>> Tel: 0033.5.24.57.41.52
>> francois.tessier_at_[hidden]
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>

-- 
| Teng Ma          Univ. of Tennessee |
| tma_at_[hidden]        Knoxville, TN |
| http://web.eecs.utk.edu/~tma/       |