
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] ConnectX with InfiniHost IB HCAs
From: Shamis, Pavel (shamisp_at_[hidden])
Date: 2011-08-26 09:42:32


You could try updating your OFED version; I think 1.5.3 is the latest release.

Pavel (Pasha) Shamis

---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory
On Aug 25, 2011, at 7:46 PM, <worldeb_at_[hidden]> <worldeb_at_[hidden]> wrote:
> 
> Hi all,
> 
> This is more of a hardware or system configuration question, but
> I hope people on this list have some experience with it.
> I have just added a new ConnectX IB card to a cluster with InfiniHost cards,
> and now no MPI programs work. Even the OFED tests do not work.
> For example, ib_send_* and ib_write_* just segfault on the host with the ConnectX card and
> keep waiting on the hosts with InfiniHost cards. The rdma_lat/bw tests segfault too, but
> with messages like these on the InfiniHost hosts:
> server read: No such file or directory
> 5924:pp_server_exch_dest: 0/45 Couldn't read remote address
> 
> pp_read_keys: No such file or directory
> Couldn't read remote address
> 
> Other diagnostic tools such as ibv_device, ibchecknet, ibstat, and ibstatus report no errors
> and show the ConnectX card in the system. All modules (mlx4_*, rdma_*) are loaded, and IPoIB is configured.
> The openibd and opensmd services started without errors.
> 
> 08:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev a0)
> OFED is 1.3.1, CentOS 5.2.
> 
> ibstat
> CA 'mlx4_0'
>        CA type: MT26428
>        Number of ports: 1
>        Firmware version: 2.7.0
>        Hardware version: a0
>        Node GUID: 0x0002c903000cad14
>        System image GUID: 0x0002c903000cad17
>        Port 1:
>                State: Active
>                Physical state: LinkUp
>                Rate: 20
>                Base lid: 60
>                LMC: 0
>                SM lid: 60
>                Capability mask: 0x0251086a
>                Port GUID: 0x0002c903000cad15
> 
> Where is the problem?
> 
> Thanks in advance,
> Egor.
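
When comparing several hosts, the ibstat output quoted above can be checked mechanically. The following is a minimal sketch, not an OFED tool: the function names parse_ibstat and port_is_up are hypothetical, and the parser assumes the output always follows the "Key: value" layout shown in the quoted message. It only confirms the link state that ibstat reports; it would not by itself explain the segfaults described in the question.

```python
import re

# Sample text copied from the ibstat output quoted in the message.
SAMPLE = """\
CA 'mlx4_0'
        CA type: MT26428
        Number of ports: 1
        Firmware version: 2.7.0
        Hardware version: a0
        Node GUID: 0x0002c903000cad14
        System image GUID: 0x0002c903000cad17
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 20
                Base lid: 60
                LMC: 0
                SM lid: 60
                Capability mask: 0x0251086a
                Port GUID: 0x0002c903000cad15
"""


def parse_ibstat(text):
    """Parse ibstat-style text into {ca_name: {"fields": {...}, "ports": {n: {...}}}}."""
    cas = {}
    ca = None       # the CA entry currently being filled
    target = None   # the dict that receives the next "Key: value" line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = re.match(r"CA '(.+)'$", line)
        if m:
            # Start of a new channel adapter section.
            ca = {"fields": {}, "ports": {}}
            cas[m.group(1)] = ca
            target = ca["fields"]
            continue
        m = re.match(r"Port (\d+):$", line)
        if m and ca is not None:
            # Start of a port section; following lines belong to this port.
            target = ca["ports"].setdefault(int(m.group(1)), {})
            continue
        if target is not None and ":" in line:
            key, value = line.split(":", 1)
            target[key.strip()] = value.strip()
    return cas


def port_is_up(port):
    """True when ibstat reports the port as Active with the physical link up."""
    return port.get("State") == "Active" and port.get("Physical state") == "LinkUp"


if __name__ == "__main__":
    info = parse_ibstat(SAMPLE)
    for name, ca in info.items():
        for number, port in ca["ports"].items():
            print(name, number, port_is_up(port), port.get("Rate"))
```

Running the same parser over the output collected from each InfiniHost and ConnectX host would make it easy to spot a port that is not Active, or firmware versions that differ between nodes.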