Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] jobs are hanging with btl_openib_component error
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2013-06-17 09:41:22


That sounds like there's a problem with your InfiniBand fabric.

You should run a complete level-0 diagnostic on your IB network.
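Before running fabric diagnostics, it can help to see which host pairs the errors cluster on. Status number 12 is IBV_WC_RETRY_EXC_ERR in libibverbs, meaning the sender retransmitted and never received an acknowledgement from the peer, so errors concentrated on one node or one pair often point at a specific cable, port, or HCA. The following is a minimal sketch, not an Open MPI tool: the regular expression is modeled only on the single error line quoted below, and the names WC_ERROR, count_failing_links, and sample are hypothetical. Other Open MPI versions may format the message differently.

```python
import re
from collections import Counter

# Assumption: this pattern is derived from the one error line quoted in
# this thread; it is not an official description of the log format.
WC_ERROR = re.compile(
    r"from (?P<src>\S+) to: (?P<dst>\S+) error polling \S+ CQ "
    r"with status (?P<status>.+?) status number (?P<num>\d+)"
)


def count_failing_links(lines):
    """Count openib completion errors per (source, destination, status)."""
    counts = Counter()
    for line in lines:
        match = WC_ERROR.search(line)
        if match:
            counts[(match["src"], match["dst"], match["status"])] += 1
    return counts


# The error line reported in the original message, used as sample input.
sample = [
    "[[61410,1],65][btl_openib_component.c:3238:handle_wc] from "
    "bng1aviationdc22 to: bng1aviationdc26 error polling LP CQ with status "
    "RETRY EXCEEDED ERROR status number 12 for wr_id 774739584 opcode 1 "
    "vendor error 129 qp_idx 0",
]

# Print the most frequently failing links first.
for (src, dst, status), n in count_failing_links(sample).most_common():
    print(f"{src} -> {dst}: {status} x{n}")
```

In practice you would feed it the lines of the job's stderr output instead of the sample list. If one hostname shows up in most of the failing pairs, start the fabric checks there.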

On Jun 17, 2013, at 5:23 AM, "Singh, Bharati (GE Global Research, consultant)" <Bharati.Singh_at_[hidden]> wrote:

> Hi Team,
>
> Our users' jobs are hanging, and we are seeing the error below.
>
> [[61410,1],65][btl_openib_component.c:3238:handle_wc] from bng1aviationdc22 to: bng1aviationdc26 error polling LP CQ with status RETRY EXCEEDED ERROR status number 12 for wr_id 774739584 opcode 1 vendor error 129 qp_idx 0
>
> Please find the attached file for more information.
>
> Thanks,
> Bharati Singh
>
>
> <output.14807.zip>

-- 
Jeff Squyres
jsquyres_at_[hidden]