Open MPI User's Mailing List Archives

Subject: [OMPI users] mlx4 error - looking for guidance
From: Jeff Layton (laytonjb_at_[hidden])
Date: 2009-03-04 21:34:10

Evening everyone,

I'm running a CFD code over InfiniBand (IB) and I've hit an error I'm not sure about. I'm looking for some guidance on where to start digging. Here's the error:

mlx4: local QP operation err (QPN 260092, WQE index 9a9e0000, vendor syndrome 6f, opcode = 5e)
[0,1,6][btl_openib_component.c:1392:btl_openib_component_progress] from compute-2-0.local to: compute-2-0.local error polling HP CQ with status LOCAL QP OPERATION ERROR status number 2 for wr_id 37742320 opcode 0
mpirun noticed that job rank 0 with PID 21220 on node compute-2-0.local exited on signal 15 (Terminated).
78 additional processes aborted (not shown)

This is openmpi-1.2.9rc2 (sorry - need to upgrade to 1.3.0). The code works correctly for smaller cases, but when I run larger cases I get this error.

I'm heading to bed but I'll check email tomorrow (sorry to post and run, but it's been a long day).