You could try updating your OFED version; I believe 1.5.3 is the latest release.
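To see how far an installed stack is from 1.5.3, the comparison could be sketched roughly as below. This is only a minimal Python sketch under assumptions, not an official tool: it assumes `ofed_info -s` is on the PATH and prints a short string of the form `OFED-1.3.1:`, and the helper names (`parse_ofed_version`, `installed_ofed_version`, `is_older`) are hypothetical.

```python
import re
import subprocess


def parse_ofed_version(text):
    """Extract a numeric tuple from a string such as 'OFED-1.3.1:'.

    Returns None when no OFED-style version is found.
    """
    match = re.search(r"OFED-(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def installed_ofed_version():
    """Query the local stack; assumes `ofed_info -s` exists (assumption)."""
    try:
        result = subprocess.run(
            ["ofed_info", "-s"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Command missing or failed: report that the version is unknown.
        return None
    return parse_ofed_version(result.stdout)


def is_older(installed, target):
    """True when the installed version tuple sorts before the target."""
    return installed < target


if __name__ == "__main__":
    target = (1, 5, 3)  # release mentioned in this reply
    installed = installed_ofed_version()
    if installed is None:
        print("Could not determine the installed OFED version")
    elif is_older(installed, target):
        print("Installed OFED", installed, "is older than", target)
    else:
        print("Installed OFED", installed, "is at least", target)
```

Run on each node, this would show whether the ConnectX host and the InfiniHost hosts are on the same OFED release before and after an upgrade.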
Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory
On Aug 25, 2011, at 7:46 PM, <worldeb_at_[hidden]> <worldeb_at_[hidden]> wrote:
>
> Hi all,
>
> This is more of a hardware or system configuration question, but
> I hope people on this list have some experience with it.
> I have just added a new ConnectX IB card to a cluster with InfiniHost cards,
> and now no MPI programs work. Even the OFED tests do not work.
> For example, ib_send_* and ib_write_* just segfault on the host with the ConnectX card and
> keep waiting on the hosts with InfiniHost cards. The rdma_lat/bw tests segfault too, but
> with messages like this on the InfiniHost hosts:
> server read: No such file or directory
> 5924:pp_server_exch_dest: 0/45 Couldn't read remote address
>
> pp_read_keys: No such file or directory
> Couldn't read remote address
>
> Other diagnostic tools such as ibv_device, ibchecknet, ibstat, and ibstatus show no errors
> and list the ConnectX card in the system. All modules (mlx4_*, rdma_*) are loaded and IPoIB is configured.
> The openibd and opensmd services started without errors.
>
> 08:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev a0)
> OFED is 1.3.1, CentOS 5.2.
>
> ibstat
> CA 'mlx4_0'
>     CA type: MT26428
>     Number of ports: 1
>     Firmware version: 2.7.0
>     Hardware version: a0
>     Node GUID: 0x0002c903000cad14
>     System image GUID: 0x0002c903000cad17
>     Port 1:
>         State: Active
>         Physical state: LinkUp
>         Rate: 20
>         Base lid: 60
>         LMC: 0
>         SM lid: 60
>         Capability mask: 0x0251086a
>         Port GUID: 0x0002c903000cad15
>
> Where could the problem be?
>
> Thanks in advance,
> Egor.
> _______________________________________________
> users mailing list
> users_at_[hidden]
> hxxp://www.open-mpi.org/mailman/listinfo.cgi/users
>