Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with openmpi and infiniband
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-01-12 12:48:41

On Jan 7, 2009, at 6:28 PM, Biagio Lucini wrote:

> [[5963,1],13][btl_openib_component.c:2893:handle_wc] from node24 to:
> node11 error polling LP CQ with status RECEIVER NOT READY RETRY
> EXCEEDED ERROR status number 13 for wr_id 37779456 opcode 0 qp_idx 0

Ah! If we're dealing with an RNR (receiver not ready) retry exceeded
error, this is *usually* a physical layer problem on the IB fabric.

Have you run a complete set of layer 0 / physical diagnostics on the
fabric to confirm that it is working properly?
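
If the job prints many of these messages, it can help to tally which
node pairs report them before going after cables and ports. Here is a
minimal sketch in Python. It assumes the messages look like the one
quoted above; the regular expression and the helper name
tally_rnr_errors are illustrative only, not part of Open MPI, and
other Open MPI versions may format the message differently.

```python
import re
from collections import Counter

# Pattern modeled on the single error message quoted above (an assumption;
# the exact wording may differ between Open MPI versions).
ERR_RE = re.compile(
    r"from\s+(?P<src>\S+)\s+to:\s+(?P<dst>\S+)\s+error polling \S+ CQ "
    r"with status (?P<status>.+?) status number (?P<num>\d+)"
)

def tally_rnr_errors(log_text):
    """Count RNR-retry-exceeded errors per (source, destination) node pair."""
    # Collapse all whitespace so messages wrapped across lines still match.
    flat = " ".join(log_text.split())
    pairs = Counter()
    for m in ERR_RE.finditer(flat):
        if "RECEIVER NOT READY RETRY EXCEEDED" in m.group("status"):
            pairs[(m.group("src"), m.group("dst"))] += 1
    return pairs

sample = """[[5963,1],13][btl_openib_component.c:2893:handle_wc] from node24 to:
node11 error polling LP CQ with status RECEIVER NOT READY RETRY
EXCEEDED ERROR status number 13 for wr_id 37779456 opcode 0 qp_idx 0"""

# Print the node pairs, most frequently failing first.
for (src, dst), count in tally_rnr_errors(sample).most_common():
    print(f"{src} -> {dst}: {count}")
```

Node pairs that show up repeatedly are a reasonable place to start
checking cables, ports, and switch links.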

Jeff Squyres
Cisco Systems