Open MPI User's Mailing List Archives

Subject: [OMPI users] Crash in code using OMPI 1.2.7 - Debugging assistance sought
From: V. Ram (v_r_959_at_[hidden])
Date: 2008-09-24 16:03:27


I have a user running a Fortran code that can be built and run on
both 32-bit and 64-bit architectures. When this code is built for the
x86-64 machines in our cluster, running OMPI 1.2.7, it runs fine.
However, if we build and run it on 32-bit x86 machines, which run the
same GNU/Linux distribution and the same OMPI 1.2.7, it crashes with
errors like:

mca_btl_tcp_frag_recv: readv failed with errno=110
mca_btl_tcp_frag_recv: readv failed with errno=104
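
For reference, the errno values in these messages can be decoded on the affected nodes. On Linux, 110 is ETIMEDOUT and 104 is ECONNRESET, which would suggest that TCP connections between nodes are timing out or being reset by the peer; that alone does not say whether the application or the installation is at fault. The following is a minimal sketch in Python, assuming a Python interpreter is available on the nodes. The helper name `describe_errno` is made up for illustration, and since errno numbering is platform-specific, it should be run on the machines that produced the errors.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno on this platform.

    describe_errno is a hypothetical helper, not part of Open MPI.
    """
    # errno.errorcode maps numeric values to names such as 'ETIMEDOUT'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives the C library's human-readable message.
    return name, os.strerror(code)


if __name__ == "__main__":
    # The two values reported by mca_btl_tcp_frag_recv above.
    for code in (110, 104):
        name, message = describe_errno(code)
        print(f"errno={code}: {name} ({message})")
```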

We have tried different Fortran compilers (both PathScale and gfortran)
and keep getting these crashes, which occur after varying numbers of
iterations. Running on a single node using MPI seems to work OK.

Are there any suggestions on how to figure out if it's a problem with
the code or the OMPI installation/software on the system? We have tried
"--debug-daemons" with no new/interesting information being revealed.
Is there a way to trap segfault messages or more detailed MPI
transaction information or anything else that could help diagnose this?


  V. Ram