Open MPI Development Mailing List Archives

Subject: [OMPI devel] Fwd: [OMPI users] OpenMPI with openib partitions
From: Matt Burgess (burgess.matt_at_[hidden])
Date: 2008-10-07 09:46:48


Lenny,

Thanks for the info. It still doesn't seem to be working. My command line
is:

/opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -H d2-ib,d3-ib -mca btl openib,self
-mca btl_openib_of_pkey_val 33033 /cluster/pallas/x86_64-ib/IMB-MPI1
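
For reference, a minimal sketch of how the decimal value 33033 relates to the partition number 0x109 mentioned in this thread. It assumes the usual InfiniBand convention that the top bit (0x8000) of a P_Key marks full membership; the helper name pkey_decimal is made up for illustration and is not part of Open MPI.

```shell
# Hypothetical helper: OR the full-membership bit (0x8000) into a
# partition number and print the resulting P_Key in decimal.
pkey_decimal() {
    printf '%d\n' "$(( $1 | 0x8000 ))"
}

pkey_decimal 0x109   # prints 33033 (0x8109)
pkey_decimal 0x1     # prints 32769 (0x8001)
```

Passing a value that already has the top bit set (for example 0x8109) gives the same result, so the helper can be used on either form.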

I don't have a "/sys/class/infiniband/mthca0/ports/1/pkeys/" but I do have
"/sys/class/infiniband/mlx4_0/ports/1/pkeys/". Its contents are:

0 106 114 122 16 24 32 40 49 57 65 73 81 9 98
1 107 115 123 17 25 33 41 5 58 66 74 82 90 99
10 108 116 124 18 26 34 42 50 59 67 75 83 91
100 109 117 125 19 27 35 43 51 6 68 76 84 92
101 11 118 126 2 28 36 44 52 60 69 77 85 93
102 110 119 127 20 29 37 45 53 61 7 78 86 94
103 111 12 13 21 3 38 46 54 62 70 79 87 95
104 112 120 14 22 30 39 47 55 63 71 8 88 96
105 113 121 15 23 31 4 48 56 64 72 80 89 97
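
Each entry in that directory is a P_Key table index, and the file should hold the key stored at that index. A rough sketch for finding which index (if any) holds a given key, assuming each file contains a single value such as 0xffff; the function name find_pkey_index is hypothetical.

```shell
# Hypothetical helper: print the index of every P_Key table entry
# under directory $1 whose value equals $2 (hex or decimal).
find_pkey_index() {
    dir=$1
    want=$(( $2 ))
    for f in "$dir"/*; do
        [ -r "$f" ] || continue          # skip if the glob matched nothing
        val=$(cat "$f")
        [ -n "$val" ] || continue        # skip empty entries
        # Compare numerically so 0x8109 and 33033 are treated as equal.
        [ "$(( $val ))" -eq "$want" ] && echo "${f##*/}"
    done
    return 0
}

find_pkey_index /sys/class/infiniband/mlx4_0/ports/1/pkeys 0x8109
```

If nothing is printed, the key is not in the port's table, which would point at the subnet manager configuration rather than at the mpirun command line.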

We aren't using OpenSM, but Voltaire's SM on a 2012 switch.

Thanks again,
Matt

On Tue, Oct 7, 2008 at 9:37 AM, Lenny Verkhovsky <lenny.verkhovsky_at_[hidden]> wrote:

> Hi Matt,
>
> It seems that the right way to do it is the following:
>
> -mca btl openib,self -mca btl_openib_ib_pkey_val 33033
>
> where the value is the decimal form of the pkey, in your case 0x8109 =
> 33033, and there is no need for a btl_openib_ib_pkey_ix value.
>
> ex.
>
> mpirun -np 2 -H witch2,witch3 -mca btl openib,self -mca
> btl_openib_ib_pkey_val 32769 ./mpi_p1_4_1_2 -t lt
> LT (2) (size min max avg) 1 3.511429 3.511429 3.511429
>
> If it's not working, check cat /sys/class/infiniband/mthca0/ports/1/pkeys/*
> for the pkeys, and check the SM; maybe it's a setup issue.
>
> Pasha is currently checking this issue.
>
> Best regards,
>
> Lenny.
>
>
>
>
>
> On 10/7/08, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>>
>> FWIW, if this configuration is for all of your users, you might want to
>> specify these MCA params in the default MCA param file, or the environment,
>> ...etc. Just so that you don't have to specify it on every mpirun command
>> line.
>>
>> See http://www.open-mpi.org/faq/?category=tuning#setting-mca-params.
>>
>>
>> On Oct 7, 2008, at 5:43 AM, Lenny Verkhovsky wrote:
>>
>>> Sorry, I misunderstood the question.
>>>
>>> Thanks to Pasha, the right command line will be
>>>
>>> -mca btl openib,self -mca btl_openib_of_pkey_val 0x8109 -mca
>>> btl_openib_of_pkey_ix 1
>>>
>>> ex.
>>>
>>> #mpirun -np 2 -H witch2,witch3 -mca btl openib,self -mca
>>> btl_openib_of_pkey_val 0x8001 -mca btl_openib_of_pkey_ix 1 ./mpi_p1_4_TRUNK
>>> -t lt
>>> LT (2) (size min max avg) 1 3.443480 3.443480 3.443480
>>>
>>>
>>> Best regards
>>>
>>> Lenny.
>>>
>>>
>>> On 10/6/08, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>>> On Oct 5, 2008, at 1:22 PM, Lenny Verkhovsky wrote:
>>>
>>> you should probably use -mca btl tcp,self -mca btl_openib_if_include
>>> ib0.8109
>>>
>>>
>>> Really? I thought we only took OpenFabrics device names in the
>>> openib_if_include MCA param...? It looks like ib0.8109 is an IPoIB device
>>> name.
>>>
>>>
>>>
>>> Lenny.
>>>
>>>
>>>
>>> On 10/3/08, Matt Burgess <burgess.matt_at_[hidden]> wrote:
>>> Hi,
>>>
>>>
>>> I'm trying to get openmpi working over openib partitions. On this
>>> cluster, the partition number is 0x109. The ib interfaces are pingable over
>>> the appropriate ib0.8109 interface:
>>>
>>> d2:/opt/openmpi-ib # ifconfig ib0.8109
>>> ib0.8109 Link encap:UNSPEC HWaddr
>>> 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
>>> inet addr:10.21.48.2 Bcast:10.21.255.255 Mask:255.255.0.0
>>> inet6 addr: fe80::202:c902:26:ca01/64 Scope:Link
>>> UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
>>> RX packets:16811 errors:0 dropped:0 overruns:0 frame:0
>>> TX packets:15848 errors:0 dropped:1 overruns:0 carrier:0
>>> collisions:0 txqueuelen:256
>>> RX bytes:102229428 (97.4 Mb) TX bytes:102324172 (97.5 Mb)
>>>
>>>
>>> I have tried the following:
>>>
>>> /opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl
>>> openib,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109
>>> -mca btl_openib_ib_pkey_ix 1 /cluster/pallas/x86_64-ib/IMB-MPI1
>>>
>>> but I just get a RETRY EXCEEDED ERROR. Is there an MCA parameter I am
>>> missing?
>>>
>>> I was successful using tcp only:
>>>
>>> /opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl
>>> tcp,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109
>>> /cluster/pallas/x86_64-ib/IMB-MPI1
>>>
>>>
>>>
>>> Thanks,
>>> Matt Burgess
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>> --
>>> Jeff Squyres
>>> Cisco Systems
>>>
>>>
>>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>>
>