Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] mlx4 error - looking for guidance
From: Pavel Shamis (Pasha) (pashash_at_[hidden])
Date: 2009-03-05 03:30:36


Jeff,
Can you please provide more information about your HCA type (ibv_devinfo -v)?
Do you see this error immediately at startup, or does it appear during
your run?

Thanks,
Pasha

Jeff Layton wrote:
> Evening everyone,
>
> I'm running a CFD code on IB and I've encountered an error I'm not
> sure about and I'm looking for some guidance on where to start
> looking. Here's the error:
>
> mlx4: local QP operation err (QPN 260092, WQE index 9a9e0000, vendor
> syndrome 6f, opcode = 5e)
> [0,1,6][btl_openib_component.c:1392:btl_openib_component_progress]
> from compute-2-0.local to: compute-2-0.local error
> polling HP CQ with status LOCAL QP OPERATION ERROR status number 2
> for wr_id 37742320 opcode 0
> mpirun noticed that job rank 0 with PID 21220 on node
> compute-2-0.local exited on signal 15 (Terminated).
> 78 additional processes aborted (not shown)
>
>
> This is openmpi-1.2.9rc2 (sorry - need to upgrade to 1.3.0). The code
> works correctly for smaller cases, but when I run larger cases I get
> this error.
>
> I'm heading to bed but I'll check email tomorrow (sorry to sleep and run
> but it's been a long day).
>
> TIA!
>
> Jeff
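
When triaging a failure like the one quoted above, it can help to pull the individual fields out of the log before comparing runs. The following is a minimal sketch in Python, assuming the two log formats shown in the quoted message and assuming the "> " quote markers have already been stripped. The helper name parse_mlx4_error and the partial status table are illustrative, not part of Open MPI or libibverbs; the table lists only the first few values of libibverbs' enum ibv_wc_status, where 2 is IBV_WC_LOC_QP_OP_ERR. The QPN, WQE index, vendor syndrome, and opcode are kept as raw strings rather than interpreted.

```python
import re

# Partial copy of libibverbs' `enum ibv_wc_status` (first entries only).
WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    1: "IBV_WC_LOC_LEN_ERR",
    2: "IBV_WC_LOC_QP_OP_ERR",
    3: "IBV_WC_LOC_EEC_OP_ERR",
    4: "IBV_WC_LOC_PROT_ERR",
    5: "IBV_WC_WR_FLUSH_ERR",
}

# Pattern for the mlx4 driver line as it appears in the quoted log.
MLX4_RE = re.compile(
    r"mlx4: local QP operation err \(QPN (?P<qpn>[0-9a-fA-F]+), "
    r"WQE index (?P<wqe_index>[0-9a-fA-F]+), "
    r"vendor syndrome (?P<syndrome>[0-9a-fA-F]+), "
    r"opcode = (?P<opcode>[0-9a-fA-F]+)\)"
)

# Pattern for the completion status reported by the openib BTL line.
STATUS_RE = re.compile(r"status number (\d+)")


def parse_mlx4_error(text):
    """Hypothetical helper: extract fields from a log excerpt.

    Line wraps introduced by the mail client are collapsed to single
    spaces first, so a field split across two lines still matches.
    """
    flat = " ".join(text.split())
    fields = {}
    match = MLX4_RE.search(flat)
    if match:
        fields.update(match.groupdict())  # raw strings, not converted
    status = STATUS_RE.search(flat)
    if status:
        code = int(status.group(1))
        fields["status_number"] = code
        fields["status_name"] = WC_STATUS.get(code, "unknown")
    return fields
```

Calling parse_mlx4_error on the excerpt from the quoted message returns a dictionary with the keys qpn, wqe_index, syndrome, opcode, status_number, and status_name, which makes it easier to see whether the same queue pair or syndrome recurs across failed runs.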