Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [OMPI users] OpenMPI with openib partitions
From: Lenny Verkhovsky (lenny.verkhovsky_at_[hidden])
Date: 2008-10-07 09:37:25


Hi Matt,

It seems that the right way to do it is the following:

-mca btl openib,self -mca btl_openib_ib_pkey_val 33033

where the value is the pkey written as a decimal number (in your case
0x8109 = 33033); there is no need for a btl_openib_ib_pkey_ix value.
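
For other partitions, the hex-to-decimal conversion can be done in the shell. A minimal sketch, assuming a shell whose printf accepts C-style hex constants (bash and dash do); pkey_to_decimal is a hypothetical helper name, not an Open MPI tool:

```shell
# pkey_to_decimal: hypothetical helper, not part of Open MPI.
# Prints the decimal form of a hex pkey such as 0x8109.
pkey_to_decimal() {
    printf '%d\n' "$1"
}

pkey_to_decimal 0x8109   # prints 33033
pkey_to_decimal 0x8001   # prints 32769
```

The second value is the one used in the example run in this message.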

ex.

mpirun -np 2 -H witch2,witch3 -mca btl openib,self -mca
btl_openib_ib_pkey_val 32769 ./mpi_p1_4_1_2 -t lt
LT (2) (size min max avg) 1 3.511429 3.511429 3.511429
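
To avoid repeating these options on every mpirun command line (Jeff makes this point in the message quoted below), the same parameters can be set in the environment. A sketch, assuming Open MPI's OMPI_MCA_<name> environment-variable convention and the decimal pkey 33033 from the command line above; substitute your own value:

```shell
# Equivalent to: -mca btl openib,self -mca btl_openib_ib_pkey_val 33033
export OMPI_MCA_btl=openib,self
export OMPI_MCA_btl_openib_ib_pkey_val=33033

# The same settings in a per-user param file ($HOME/.openmpi/mca-params.conf)
# would be written one per line as "name = value":
#   btl = openib,self
#   btl_openib_ib_pkey_val = 33033
```

Parameters given explicitly on the mpirun command line should still take precedence over these, so a one-off override remains possible.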

If it's not working, check the output of
cat /sys/class/infiniband/mthca0/ports/1/pkeys/*
for the pkeys, and check the SM (subnet manager); it may be a setup problem.
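
A plain cat prints the values without showing which table slot each one sits in. The sketch below prints each entry next to its index (the file name). list_pkeys is a hypothetical helper, and it assumes unused slots read as 0x0000; adjust the device name (mthca0) and port number (1) for your HCA:

```shell
# list_pkeys: hypothetical helper, not part of Open MPI or OFED.
# Prints "index: value" for each non-empty entry of a port's pkey table.
# The default directory is the one from this message.
list_pkeys() {
    dir="${1:-/sys/class/infiniband/mthca0/ports/1/pkeys}"
    for f in "$dir"/*; do
        [ -r "$f" ] || continue              # glob matched nothing readable
        val=$(cat "$f")
        if [ "$val" != "0x0000" ]; then      # assumption: 0x0000 = unused slot
            printf '%s: %s\n' "${f##*/}" "$val"
        fi
    done
}

list_pkeys
```

If the expected pkey (0x8109 in Matt's case) does not appear on every node, the partition configuration on the SM is the first thing to look at.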

Pasha is currently checking this issue.

Best regards,

Lenny.

On 10/7/08, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>
> FWIW, if this configuration is for all of your users, you might want to
> specify these MCA params in the default MCA param file, or the environment,
> ...etc. Just so that you don't have to specify it on every mpirun command
> line.
>
> See http://www.open-mpi.org/faq/?category=tuning#setting-mca-params.
>
>
> On Oct 7, 2008, at 5:43 AM, Lenny Verkhovsky wrote:
>
>> Sorry, I misunderstood the question.
>>
>> Thanks to Pasha, the right command line will be
>>
>> -mca btl openib,self -mca btl_openib_of_pkey_val 0x8109 -mca
>> btl_openib_of_pkey_ix 1
>>
>> ex.
>>
>> #mpirun -np 2 -H witch2,witch3 -mca btl openib,self -mca
>> btl_openib_of_pkey_val 0x8001 -mca btl_openib_of_pkey_ix 1 ./mpi_p1_4_TRUNK
>> -t lt
>> LT (2) (size min max avg) 1 3.443480 3.443480 3.443480
>>
>>
>> Best regards
>>
>> Lenny.
>>
>>
>> On 10/6/08, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>>
>> On Oct 5, 2008, at 1:22 PM, Lenny Verkhovsky wrote:
>>
>> you should probably use -mca btl tcp,self -mca btl_openib_if_include ib0.8109
>>
>>
>> Really? I thought we only took OpenFabrics device names in the
>> openib_if_include MCA param...? It looks like ib0.8109 is an IPoIB device
>> name.
>>
>>
>>
>> Lenny.
>>
>>
>> On 10/3/08, Matt Burgess <burgess.matt_at_[hidden]> wrote:
>> Hi,
>>
>>
>> I'm trying to get openmpi working over openib partitions. On this cluster,
>> the partition number is 0x109. The ib interfaces are pingable over the
>> appropriate ib0.8109 interface:
>>
>> d2:/opt/openmpi-ib # ifconfig ib0.8109
>> ib0.8109 Link encap:UNSPEC HWaddr
>> 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
>> inet addr:10.21.48.2 Bcast:10.21.255.255 Mask:255.255.0.0
>> inet6 addr: fe80::202:c902:26:ca01/64 Scope:Link
>> UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
>> RX packets:16811 errors:0 dropped:0 overruns:0 frame:0
>> TX packets:15848 errors:0 dropped:1 overruns:0 carrier:0
>> collisions:0 txqueuelen:256
>> RX bytes:102229428 (97.4 Mb) TX bytes:102324172 (97.5 Mb)
>>
>>
>> I have tried the following:
>>
>> /opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl
>> openib,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109
>> -mca btl_openib_ib_pkey_ix 1 /cluster/pallas/x86_64-ib/IMB-MPI1
>>
>> but I just get a RETRY EXCEEDED ERROR. Is there an MCA parameter I am
>> missing?
>>
>> I was successful using tcp only:
>>
>> /opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl
>> tcp,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109
>> /cluster/pallas/x86_64-ib/IMB-MPI1
>>
>>
>>
>> Thanks,
>> Matt Burgess
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>>
>>
>
> --
> Jeff Squyres
> Cisco Systems
>
>