Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Error launching w/ 1.5.3 on IB mthca nodes
From: V. Ram (vramml0_at_[hidden])
Date: 2011-12-20 16:08:01


Hello,

On Mon, Dec 19, 2011, at 03:30 PM, Yevgeny Kliteynik wrote:
> Hi,
>
> What's the smallest number of nodes that are needed to reproduce this
> problem? Does it happen with just two HCAs, one process per node?

I believe so, but I will work with some users to verify this.

> Let's get you to the latest firmware GA of this card.
> Run "ibv_devinfo | grep board_id", and find the latest FW GA for
> your device here:
> http://www.mellanox.com/content/pages.php?pg=firmware_download
> It has all the instructions how to update FW.

I think we're already there.

The firmware download page you linked lists version 4.8.200 for our
adapters (ibv_devinfo output posted below).

However, all of our adapters are running 4.8.917.
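
To make the comparison explicit, here is a minimal sketch that treats
the two version strings as dotted integer fields. That the fields order
numerically is an assumption on my part, and parse_fw_version is just a
hypothetical helper name, not part of any Mellanox or OFED tool.

```python
def parse_fw_version(text):
    """Split a dotted firmware string such as "4.8.917" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Version listed on the firmware download page for our board_id.
listed_ga = parse_fw_version("4.8.200")
# Version reported by fw_ver in ibv_devinfo on our nodes.
installed = parse_fw_version("4.8.917")

# Tuples compare field by field, so this is True when the installed
# firmware is numerically newer than the listed GA release
# (assumption: numeric ordering matches release ordering).
installed_is_newer = installed > listed_ga
print(installed_is_newer)
```

Under that assumption, the installed firmware compares as newer than
the listed GA version.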

http://www.mail-archive.com/ofw@lists.openfabrics.org/msg00686.html
gives the only info we can seem to find on that firmware version.

I believe the OFED 1.2 release came with this firmware file and update
tools for the HCA. Some of the nodes that were shipped to us came with
this firmware version onboard from the factory, so we updated the other
nodes to match.

For what it's worth, we saw these errors before and after the firmware
updates.

> Also, please post here some more information about your HCA
> ("ibv_devinfo" output should do).

ibv_devinfo output:

hca_id: mthca0
        transport: InfiniBand (0)
        fw_ver: 4.8.917
        node_guid: 0005:ad00:000b:5454
        sys_image_guid: 0005:ad00:0100:d050
        vendor_id: 0x05ad
        vendor_part_id: 25208
        hw_ver: 0xA0
        board_id: MT_00A0000001
        phys_port_cnt: 2
                port: 1
                        state: PORT_ACTIVE (4)
                        max_mtu: 2048 (4)
                        active_mtu: 2048 (4)
                        sm_lid: 2
                        port_lid: 45
                        port_lmc: 0x00
                        link_layer: IB

                port: 2
                        state: PORT_DOWN (1)
                        max_mtu: 2048 (4)
                        active_mtu: 512 (2)
                        sm_lid: 0
                        port_lid: 0
                        port_lmc: 0x00
                        link_layer: IB
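
In case it helps with checking every node, below is a rough sketch of
how output like the above could be parsed so that fw_ver, board_id and
port state can be compared across hosts. It only assumes the
"key: value" layout shown here; parse_ibv_devinfo and SAMPLE are
hypothetical names of my own, and the sample is a trimmed copy of the
output above.

```python
SAMPLE = """\
hca_id: mthca0
        transport: InfiniBand (0)
        fw_ver: 4.8.917
        node_guid: 0005:ad00:000b:5454
        board_id: MT_00A0000001
        phys_port_cnt: 2
                port: 1
                        state: PORT_ACTIVE (4)
                        active_mtu: 2048 (4)

                port: 2
                        state: PORT_DOWN (1)
                        active_mtu: 512 (2)
"""


def parse_ibv_devinfo(text):
    """Parse ibv_devinfo-style text into a list of device dicts.

    Each device dict holds its own fields plus a "ports" list of dicts.
    """
    devices = []
    device = None
    port = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # skip blank or unrecognised lines
        # Split on the first colon only, so GUIDs keep their colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "hca_id":
            device = {"hca_id": value, "ports": []}
            devices.append(device)
            port = None
        elif device is None:
            continue  # ignore anything before the first hca_id line
        elif key == "port":
            port = {"port": int(value)}
            device["ports"].append(port)
        elif port is not None:
            port[key] = value  # field belongs to the current port
        else:
            device[key] = value  # field belongs to the device itself
    return devices


for dev in parse_ibv_devinfo(SAMPLE):
    print(dev["hca_id"], dev["fw_ver"], dev["board_id"])
    for p in dev["ports"]:
        print("  port", p["port"], p["state"])
```

Feeding it the saved output from each node would make it easy to spot
a host whose firmware or port state differs from the rest.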

Thanks.
