Open MPI Development Mailing List Archives

Subject: [OMPI devel] 32-bit openib is broken on the trunk as of Nov 27th, r16799
From: Tim Mattox (timattox_at_[hidden])
Date: 2007-12-05 14:45:17


Hello,
It appears that sometime after r16777, and no later than r16799, something
broke the trunk's openib support for 32-bit builds.
The 64-bit tests all seem normal, as do the 32-bit and 64-bit tests on
the 1.2 branch on the same machine (odin).

See this MTT results page permalink showing the 32-bit odin runs:
http://www.open-mpi.org/mtt/index.php?do_redir=468

Pasha & Gleb, you both made a variety of checkins in that svn revision range.
Does either of you have time to investigate this?

Here is a snippet from one randomly picked failed test (out of thousands):
[1,1][btl_openib_component.c:1665:btl_openib_module_progress] from odin001 to: odin001 error polling LP CQ with status LOCAL PROTOCOL ERROR status number 4 for wr_id 141733120 opcode 128 qp_idx 3
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 29761 on
node odin001 calling "abort". This will have caused other processes
in the application to be terminated by signals sent by mpirun
(as reported here).
--------------------------------------------------------------------------
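
For anyone triaging these by hand, the numeric fields in that message can be
decoded against the libibverbs enums in <infiniband/verbs.h>. Below is a
minimal sketch in Python, not anything that exists in Open MPI or MTT: the
helper name decode_openib_error and the regular expression are assumptions
made for illustration, and the tables list only a subset of enum ibv_wc_status
and enum ibv_wc_opcode. Under those assumptions, status number 4 maps to
IBV_WC_LOC_PROT_ERR and opcode 128 to IBV_WC_RECV.

```python
import re

# Subset of libibverbs `enum ibv_wc_status` (values from <infiniband/verbs.h>).
WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    1: "IBV_WC_LOC_LEN_ERR",
    2: "IBV_WC_LOC_QP_OP_ERR",
    3: "IBV_WC_LOC_EEC_OP_ERR",
    4: "IBV_WC_LOC_PROT_ERR",
    5: "IBV_WC_WR_FLUSH_ERR",
}

# Subset of `enum ibv_wc_opcode`; receive-side opcodes start at 1 << 7.
WC_OPCODE = {
    0: "IBV_WC_SEND",
    1: "IBV_WC_RDMA_WRITE",
    2: "IBV_WC_RDMA_READ",
    128: "IBV_WC_RECV",
    129: "IBV_WC_RECV_RDMA_WITH_IMM",
}

# Hypothetical pattern matching the tail of the error message quoted above.
LINE_RE = re.compile(
    r"status number (?P<status>\d+) for wr_id (?P<wr_id>\d+) "
    r"opcode (?P<opcode>\d+)\s+qp_idx (?P<qp_idx>\d+)"
)


def decode_openib_error(text):
    """Return the numeric fields of an openib CQ error plus symbolic names.

    Returns None if the text does not contain a recognizable error message.
    """
    # Collapse all whitespace so a message wrapped by a mail client still matches.
    flat = " ".join(text.split())
    match = LINE_RE.search(flat)
    if match is None:
        return None
    fields = {key: int(value) for key, value in match.groupdict().items()}
    fields["status_name"] = WC_STATUS.get(fields["status"], "unknown")
    fields["opcode_name"] = WC_OPCODE.get(fields["opcode"], "unknown")
    return fields
```

Feeding it the snippet above (wrapped or not) should give back the status and
opcode names alongside the raw wr_id and qp_idx values, which makes it easier
to group the thousands of failures by error type.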

Thanks, and happy bug hunting!

-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmattox_at_[hidden] || timattox_at_[hidden]
    I'm a bright... http://www.the-brights.net/