
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] OpenIB Error in ibv_create_srq
From: Terry Dontje (terry.dontje_at_[hidden])
Date: 2010-08-04 12:03:28


Allen Barnett wrote:
> Thanks for the pointer!
>
> Do you know if these sizes are dependent on the hardware?
>
They can be. The following file sets up the defaults for some known cards:

ompi/mca/btl/openib/mca-btl-openib-device-params.ini
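For reference, entries in that file are INI-style sections keyed by the device's vendor and part IDs. The sketch below shows roughly what an entry for the MT23108 (Tavor) family might look like; the field names follow the 1.4-series file, but the values here are illustrative, not the exact shipped defaults:

```ini
# Hypothetical sketch of a device entry in
# ompi/mca/btl/openib/mca-btl-openib-device-params.ini.
# Values are illustrative, not the exact shipped defaults.
[Mellanox Tavor Infinihost]
vendor_id = 0x2c9,0x5ad,0x66a,0x8f1
vendor_part_id = 23108
use_eager_rdma = 1
mtu = 1024
```

Open MPI matches the running HCA's vendor_id/vendor_part_id (as reported by ibv_devinfo) against these sections to pick per-device defaults.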

--td
> Thanks,
> Allen
>
> On Tue, 2010-08-03 at 10:29 -0400, Terry Dontje wrote:
>
>> Sorry, I didn't see your prior question; glad you found the
>> btl_openib_receive_queues parameter. There is no FAQ entry for
>> this, but I found the following in the openib BTL help file, which
>> spells out the parameters used with a per-peer receive queue (i.e., a
>> receive queue setting with "P" as the first field).
>>
>> Per-peer receive queues require between 2 and 5 parameters:
>>
>> 1. Buffer size in bytes (mandatory)
>> 2. Number of buffers (mandatory)
>> 3. Low buffer count watermark (optional; defaults to (num_buffers / 2))
>> 4. Credit window size (optional; defaults to (low_watermark / 2))
>> 5. Number of buffers reserved for credit messages (optional;
>> defaults to ((num_buffers * 2) - 1) / credit_window)
>>
>> Example: P,128,256,128,16
>> - 128 byte buffers
>> - 256 buffers to receive incoming MPI messages
>> - When the number of available buffers reaches 128, re-post 128 more
>> buffers to reach a total of 256
>> - If the number of available credits reaches 16, send an explicit
>> credit message to the sender
>> - Defaulting to ((256 * 2) - 1) / 16 = 31; this many buffers are
>> reserved for explicit credit messages
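The defaulting rules above can be sketched as follows. This is a minimal illustration of the arithmetic, not OMPI source code, and the function name is made up:

```python
# Sketch of how the optional per-peer receive queue parameters default
# when omitted, per the description above. Not OMPI source; the function
# name is hypothetical.
def per_peer_defaults(buffer_size, num_buffers,
                      low_watermark=None, credit_window=None,
                      reserved_credit_buffers=None):
    if low_watermark is None:
        low_watermark = num_buffers // 2
    if credit_window is None:
        credit_window = low_watermark // 2
    if reserved_credit_buffers is None:
        # ((num_buffers * 2) - 1) / credit_window, integer division
        reserved_credit_buffers = (num_buffers * 2 - 1) // credit_window
    return (buffer_size, num_buffers, low_watermark,
            credit_window, reserved_credit_buffers)

# The "P,128,256,128,16" example: credit window given, reserve count defaulted.
print(per_peer_defaults(128, 256, low_watermark=128, credit_window=16))
# -> (128, 256, 128, 16, 31)
```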
>>
>> --td
>> Allen Barnett wrote:
>>
>>> Hi: In response to my own question, by studying the file
>>> mca-btl-openib-device-params.ini, I discovered that this option in
>>> OMPI-1.4.2:
>>>
>>> -mca btl_openib_receive_queues P,65536,256,192,128
>>>
>>> was sufficient to prevent OMPI from trying to create shared receive
>>> queues and allowed my application to run to completion using the IB
>>> hardware.
>>>
>>> I guess my question now is: What do these numbers mean? Presumably the
>>> size (or counts?) of buffers to allocate? Are there limits or a way to
>>> tune these values?
>>>
>>> Thanks,
>>> Allen
>>>
>>> On Mon, 2010-08-02 at 12:49 -0400, Allen Barnett wrote:
>>>
>>>
>>>> Hi Terry:
>>>> It is indeed the case that the openib BTL has not been initialized. I
>>>> ran with your tcp-disabled MCA option and it aborted in MPI_Init.
>>>>
>>>> The OFED stack is what's included in RHEL4. It appears to be made up of
>>>> the RPMs:
>>>> openib-1.4-1.el4
>>>> opensm-3.2.5-1.el4
>>>> libibverbs-1.1.2-1.el4
>>>>
>>>> How can I determine if srq is supported? Is there an MCA option to
>>>> defeat it? (Our in-house cluster has more recent Mellanox IB hardware
>>>> and is running this same IB stack and ompi 1.4.2 works OK, so I suspect
>>>> srq is supported by the OpenFabrics stack. Perhaps.)
>>>>
>>>> Thanks,
>>>> Allen
>>>>
>>>> On Mon, 2010-08-02 at 06:47 -0400, Terry Dontje wrote:
>>>>
>>>>
>>>>> My guess, from the message below saying "(openib) BTL failed to
>>>>> initialize", is that the code is probably running over TCP. To
>>>>> prove this conclusively, you can restrict the run to only the
>>>>> openib, sm, and self BTLs, eliminating the tcp BTL. To do that,
>>>>> add the following to the mpirun line: "-mca btl openib,sm,self".
>>>>> I believe with that specification the code will abort rather than
>>>>> run to completion.
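For concreteness, the suggested invocation would look something like this; the process count and program name are placeholders:

```shell
# Restrict Open MPI to the openib, sm, and self BTLs. With tcp excluded,
# the job aborts if the openib BTL fails to initialize instead of
# silently falling back to TCP.
mpirun -np 4 -mca btl openib,sm,self ./my_mpi_app
```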
>>>>>
>>>>> What version of the OFED stack are you using? I wonder whether
>>>>> SRQ is supported on your system.
>>>>>
>>>>> --td
>>>>>
>>>>> Allen Barnett wrote:
>>>>>
>>>>>
>>>>>> Hi: A customer is attempting to run our Open MPI 1.4.2-based application
>>>>>> on a cluster of machines running RHEL4 with the standard OFED stack. The
>>>>>> HCAs are identified as:
>>>>>>
>>>>>> 03:01.0 PCI bridge: Mellanox Technologies MT23108 PCI Bridge (rev a1)
>>>>>> 04:00.0 InfiniBand: Mellanox Technologies MT23108 InfiniHost (rev a1)
>>>>>>
>>>>>> ibv_devinfo says that one port on the HCAs is active but the other is
>>>>>> down:
>>>>>>
>>>>>> hca_id: mthca0
>>>>>> fw_ver: 3.0.2
>>>>>> node_guid: 0006:6a00:9800:4c78
>>>>>> sys_image_guid: 0006:6a00:9800:4c78
>>>>>> vendor_id: 0x066a
>>>>>> vendor_part_id: 23108
>>>>>> hw_ver: 0xA1
>>>>>> phys_port_cnt: 2
>>>>>> port: 1
>>>>>> state: active (4)
>>>>>> max_mtu: 2048 (4)
>>>>>> active_mtu: 2048 (4)
>>>>>> sm_lid: 1
>>>>>> port_lid: 26
>>>>>> port_lmc: 0x00
>>>>>>
>>>>>> port: 2
>>>>>> state: down (1)
>>>>>> max_mtu: 2048 (4)
>>>>>> active_mtu: 512 (2)
>>>>>> sm_lid: 0
>>>>>> port_lid: 0
>>>>>> port_lmc: 0x00
>>>>>>
>>>>>>
>>>>>> When the OMPI application is run, it prints the error message:
>>>>>>
>>>>>> --------------------------------------------------------------------
>>>>>> The OpenFabrics (openib) BTL failed to initialize while trying to
>>>>>> create an internal queue. This typically indicates a failed
>>>>>> OpenFabrics installation, faulty hardware, or that Open MPI is
>>>>>> attempting to use a feature that is not supported on your hardware
>>>>>> (i.e., is a shared receive queue specified in the
>>>>>> btl_openib_receive_queues MCA parameter with a device that does not
>>>>>> support it?). The failure occurred here:
>>>>>>
>>>>>> Local host: machine001.lan
>>>>>> OMPI source: /software/openmpi-1.4.2/ompi/mca/btl/openib/btl_openib.c:250
>>>>>> Function: ibv_create_srq()
>>>>>> Error: Invalid argument (errno=22)
>>>>>> Device: mthca0
>>>>>>
>>>>>> You may need to consult with your system administrator to get this
>>>>>> problem fixed.
>>>>>> --------------------------------------------------------------------
>>>>>>
>>>>>> The full log of a run with "btl_openib_verbose 1" is attached. My
>>>>>> application appears to run to completion, but I can't tell if it's just
>>>>>> running on TCP and not using the IB hardware.
>>>>>>
>>>>>> I would appreciate any suggestions on how to proceed to fix this error.
>>>>>>
>>>>>> Thanks,
>>>>>> Allen
>>>>>>
>
>

-- 
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.650.633.7054
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.dontje_at_[hidden]


